OpenAI’s o3 model is acing AI reasoning tests, but it’s still not AGI

The race for artificial general intelligence (AGI) continues as OpenAI’s latest o3 model achieves remarkable scores on a key reasoning test, though experts maintain it falls short of true human-level intelligence.

Breaking development: OpenAI’s new o3 model has achieved a breakthrough score of 75.7% on the Abstraction and Reasoning Corpus (ARC) Challenge, a test designed to evaluate AI systems’ pattern recognition and reasoning capabilities.

  • The model demonstrated task-adaptation abilities not previously seen in GPT-family models
  • The official score was achieved within the competition’s computing cost limit of $20 per puzzle task
  • An unofficial score of 87.5% was reached using significantly more computing power, surpassing the typical human score of 84%

Technical details and constraints: The ARC Challenge tests AI systems’ ability to identify patterns in colored grid puzzles while operating within specific computational limitations.

  • The “semi-private” test, used for public rankings, allows computing costs up to $10,000 total
  • A more stringent “private” test, used for determining grand prize winners, limits computing costs to 10 cents per task
  • o3’s unofficial high score required 172 times more computing power than its official attempt, with costs reaching thousands of dollars per task
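For readers unfamiliar with the benchmark's format: ARC tasks are published as JSON-like structures with a handful of "train" demonstration pairs and held-out "test" pairs, where each grid cell is an integer 0–9 standing for a color. The following is an illustrative sketch only (the task and solver below are invented for demonstration, with the inferred rule being "mirror each row"), showing how exact-match scoring over a task's test pairs works:

```python
# Illustrative ARC-style task: "train" holds demonstration input/output
# pairs, "test" holds held-out pairs; cells are color codes 0-9.
task = {
    "train": [
        {"input": [[1, 0], [2, 3]], "output": [[0, 1], [3, 2]]},
        {"input": [[5, 5, 0], [0, 7, 7]], "output": [[0, 5, 5], [7, 7, 0]]},
    ],
    "test": [
        {"input": [[4, 0, 4], [0, 4, 0]], "output": [[4, 0, 4], [0, 4, 0]]},
    ],
}

def solve(grid):
    """Hypothetical solver: the rule inferred from the train pairs
    here is simply 'mirror each row left-to-right'."""
    return [row[::-1] for row in grid]

def score(task, solver):
    """Fraction of test pairs where the predicted grid matches exactly;
    ARC grading is all-or-nothing per output grid."""
    hits = sum(solver(pair["input"]) == pair["output"] for pair in task["test"])
    return hits / len(task["test"])

print(score(task, solve))  # 1.0 for this toy task
```

A solver only gets credit when every cell of the predicted grid is correct, which is part of why brute-force search over candidate programs, as discussed below, becomes so compute-hungry.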

Expert perspectives: Leading AI researchers and competition organizers maintain that while impressive, this achievement does not constitute AGI.

  • François Chollet, ARC Challenge creator, describes it as an important milestone but not AGI
  • Melanie Mitchell of the Santa Fe Institute argues that solving tasks through computational brute force defeats the challenge’s purpose
  • Thomas Dietterich from Oregon State University notes that commercial AI systems still lack crucial components of human cognition, including episodic memory and meta-cognition

Industry implications: The achievement comes during a period of perceived slowdown in AI advancement compared to the rapid developments of 2023.

  • The results suggest AI models could soon legitimately beat the competition benchmark
  • Multiple submissions have already scored above 81% on the private evaluation test set
  • Competition organizers are planning a more challenging benchmark test for 2025

Looking ahead: While o3’s performance represents significant progress in AI capabilities, key questions remain about the model’s methodology and true understanding of the tasks it completes.

  • Researchers await open-source replication to fully evaluate the achievement’s significance
  • The ARC Prize 2025 challenge continues until someone achieves the grand prize with an open-source solution
  • The gap between computational problem-solving and true human-like reasoning remains a central challenge in AI development
