
AI’s struggle with visual reasoning puzzles: Recent research from the USC Viterbi School of Engineering’s Information Sciences Institute (ISI) tested the ability of multimodal large language models (MLLMs) to solve abstract visual puzzles similar to those found on human IQ tests, revealing significant limitations in the models’ reasoning abilities.

  • The study, presented at the Conference on Language Modeling (COLM 2024) in Philadelphia, focused on evaluating the nonverbal abstract reasoning abilities of both open-source and closed-source MLLMs.
  • Researchers used puzzles developed from Raven’s Progressive Matrices, a standard type of abstract reasoning test, to challenge the AI models’ visual perception and logical reasoning skills.
  • The tests required AI models to identify patterns and apply them to different scenarios, such as recognizing that a yellow circle turning into a blue triangle represents a specific transformation.
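The transformation described in the last bullet can be sketched in code. This is a toy illustration of the pattern-and-apply structure of such puzzles, not the study’s actual test format; the attribute representation and function names are hypothetical:

```python
# Toy sketch of a Raven's-style transformation rule. A puzzle shows an
# example pair (before -> after); the solver must infer the rule and
# apply it to a new item. Items here are (color, shape) tuples -- an
# illustrative representation, not the benchmark's actual encoding.

def infer_rule(before, after):
    """Record which attributes changed between the example pair."""
    changes = {}
    if before[0] != after[0]:
        changes["color"] = after[0]
    if before[1] != after[1]:
        changes["shape"] = after[1]
    return changes

def apply_rule(item, rule):
    """Apply the inferred attribute changes to a new item."""
    color = rule.get("color", item[0])
    shape = rule.get("shape", item[1])
    return (color, shape)

# Example from the text: a yellow circle turning into a blue triangle.
rule = infer_rule(("yellow", "circle"), ("blue", "triangle"))
print(apply_rule(("red", "circle"), rule))  # ('blue', 'triangle')
```

Solving the real puzzles is much harder than this sketch suggests: the model must first perceive the shapes and colors from pixels, which is exactly the visual-perception step where MLLMs tend to fail.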

Performance disparities between AI models: The study revealed significant differences in performance between open-source and closed-source AI models, with the latter demonstrating superior capabilities in visual reasoning tasks.

  • Open-source models generally struggled more with visual reasoning puzzles compared to their closed-source counterparts.
  • GPT-4V, a closed-source model, showed relatively good reasoning abilities, although it still fell short of human-level performance.
  • Researchers attribute the better performance of closed-source models to factors such as specialized development, training on larger datasets, and access to greater computing resources from private companies.

Improving AI performance: The research team explored methods to enhance the AI models’ problem-solving abilities, with some success in guiding their reasoning processes.

  • Chain-of-Thought prompting, a technique that instructs models to work through intermediate reasoning steps before answering, improved performance for some AI models.
  • This approach demonstrates the potential for developing more effective strategies to enhance AI’s cognitive abilities in the future.
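A Chain-of-Thought prompt for this kind of puzzle might look like the sketch below. The wording and step breakdown are hypothetical (the paper’s exact prompts aren’t given here), and a real evaluation would also attach the puzzle image through a multimodal API:

```python
# Minimal sketch of a Chain-of-Thought prompt for an abstract visual
# puzzle. The instruction text is illustrative, not the study's prompt.

def build_cot_prompt(puzzle_description: str) -> str:
    """Wrap a puzzle description in step-by-step reasoning instructions."""
    return (
        f"Puzzle: {puzzle_description}\n"
        "Let's solve this step by step:\n"
        "1. Describe each panel of the matrix.\n"
        "2. Identify how panels change across rows and columns.\n"
        "3. State the transformation rule.\n"
        "4. Apply the rule to predict the missing panel.\n"
        "Answer with the letter of the matching option."
    )

prompt = build_cot_prompt(
    "A 3x3 matrix where a yellow circle becomes a blue triangle."
)
print(prompt)
```

The key idea is simply that asking the model to externalize intermediate steps, rather than jump straight to an answer, gives it a scaffold for the multi-step reasoning these puzzles require.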

Implications for AI development: The study’s findings highlight the current limitations of AI in abstract reasoning tasks and underscore the importance of continued research in this area.

  • Jay Pujara, research associate professor and study author, emphasized the need to understand AI models’ limitations to make them better, safer, and more useful.
  • By identifying weaknesses in AI’s reasoning abilities, this research can help direct future efforts to develop more advanced and capable AI systems.
  • The goal of achieving human-level logic in AI remains a significant challenge, with current models still far from matching human cognitive abilities in complex reasoning tasks.

Broader context of AI capabilities: This study contributes to the ongoing assessment of AI’s strengths and weaknesses across various cognitive domains.

  • While AI has shown remarkable progress in certain areas, such as natural language processing and image recognition, abstract reasoning remains a significant hurdle.
  • The research highlights the complexity of human cognition and the challenges involved in replicating these abilities in artificial systems.
  • As AI continues to advance, understanding its limitations becomes crucial for responsible development and deployment in real-world applications.

Looking ahead: The study’s results open up new avenues for AI research and development while raising important questions about the future of artificial intelligence.

  • Researchers may focus on developing more sophisticated training methods and architectures to improve AI’s abstract reasoning capabilities.
  • The disparity between open-source and closed-source models’ performance may fuel discussions about access to resources and the potential for a widening gap in AI capabilities.
  • As AI systems become more advanced, ongoing evaluation of their cognitive abilities will be essential to ensure they are deployed safely and effectively in various domains.
