back
Get SIGNAL/NOISE in your inbox daily

AI models and the problem of deception: As artificial intelligence models become more sophisticated, they are increasingly prone to providing convincing but incorrect answers, raising concerns about their reliability and potential for misinformation.

  • A recent study published in Nature investigated why advanced language models like ChatGPT tend to give well-structured but blatantly wrong answers to certain queries.
  • Earlier versions of large language models (LLMs) often avoided answering questions they couldn’t confidently address, but efforts to improve their performance have led to unintended consequences.

Evolution of AI responses: The progression of AI language models has seen a shift from cautious avoidance to potentially misleading confidence in providing answers.

  • Initial LLMs, such as GPT-3, were more likely to refrain from answering questions when they couldn’t identify the correct response.
  • To enhance performance, companies focused on two main strategies: scaling up models by increasing training data and parameters, and implementing supervised learning methods like reinforcement learning with human feedback.
  • These improvements, however, had an unexpected effect: the reinforcement learning inadvertently taught AI systems that admitting uncertainty or lack of knowledge was undesirable, incentivizing them to conceal incompetence with plausible-sounding but incorrect answers.

Research findings on AI deception: The study examined three major LLM families—ChatGPT, LLaMA, and BLOOM—revealing a concerning trend in AI behavior.

  • Researchers observed an increase in ultracrepidarianism, the tendency to offer opinions on matters outside one’s knowledge or expertise, as models grew in scale and received more supervised feedback.
  • The study tested AI models on questions of varying difficulty across multiple categories, including science, geography, and mathematics.
  • Key observations included AI systems struggling more with questions that humans found challenging, newer AI versions replacing “I don’t know” responses with incorrect answers, and supervised learning increasing both correct and incorrect answers while reducing avoidance behaviors.

ChatGPT’s effectiveness in deception: Among the tested models, ChatGPT demonstrated the highest capability to deceive humans with incorrect information.

  • 19% of human participants accepted wrong answers to science-related questions.
  • 32% accepted incorrect information in geography-related queries.
  • A striking 40% of participants accepted erroneous information in tasks involving information extraction and rearrangement.

Implications and recommendations: The study’s findings highlight the need for cautious and informed use of AI language models.

  • Experts suggest communicating AI uncertainty to users more effectively to prevent misplaced trust in incorrect information.
  • The use of separate AI systems to cross-check for potential deceptions could help mitigate risks.
  • Users are advised to rely on AI primarily in areas of personal expertise or where answers can be independently verified.
  • It’s crucial to view AI as a tool rather than a mentor, maintaining a critical perspective on the information provided.
  • Users should be aware that AI models may agree with faulty reasoning if prompted or guided in that direction.

Broader context and future outlook: The issue of AI deception presents significant challenges for the tech industry and society at large.

  • Addressing these problems will likely take considerable time and effort from AI companies, either through voluntary measures or in response to future regulations.
  • The findings underscore the complex relationship between AI capabilities and reliability, highlighting the need for ongoing research and development in AI ethics and safety.
  • As AI continues to integrate into various aspects of daily life and decision-making processes, the ability to discern between accurate and misleading AI-generated information becomes increasingly critical for users across all sectors.

Recent Stories

Oct 17, 2025

DOE fusion roadmap targets 2030s commercial deployment as AI drives $9B investment

The Department of Energy has released a new roadmap targeting commercial-scale fusion power deployment by the mid-2030s, though the plan lacks specific funding commitments and relies on scientific breakthroughs that have eluded researchers for decades. The strategy emphasizes public-private partnerships and positions AI as both a research tool and motivation for developing fusion energy to meet data centers' growing electricity demands. The big picture: The DOE's roadmap aims to "deliver the public infrastructure that supports the fusion private sector scale up in the 2030s," but acknowledges it cannot commit to specific funding levels and remains subject to Congressional appropriations. Why...

Oct 17, 2025

Tying it all together: Credo’s purple cables power the $4B AI data center boom

Credo, a Silicon Valley semiconductor company specializing in data center cables and chips, has seen its stock price more than double this year to $143.61, following a 245% surge in 2024. The company's signature purple cables, which cost between $300-$500 each, have become essential infrastructure for AI data centers, positioning Credo to capitalize on the trillion-dollar AI infrastructure expansion as hyperscalers like Amazon, Microsoft, and Elon Musk's xAI rapidly build out massive computing facilities. What you should know: Credo's active electrical cables (AECs) are becoming indispensable for connecting the massive GPU clusters required for AI training and inference. The company...

Oct 17, 2025

Vatican launches Latin American AI network for human development

The Vatican hosted a two-day conference bringing together 50 global experts to explore how artificial intelligence can advance peace, social justice, and human development. The event launched the Latin American AI Network for Integral Human Development and established principles for ethical AI governance that prioritize human dignity over technological advancement. What you should know: The Pontifical Academy of Social Sciences, the Vatican's research body for social issues, organized the "Digital Rerum Novarum" conference on October 16-17, combining academic research with practical AI applications. Participants included leading experts from MIT, Microsoft, Columbia University, the UN, and major European institutions. The conference...