Google DeepMind’s AI can now outperform top human solvers on olympiad geometry problems

Google DeepMind has announced significant improvements to its AI geometry problem solver, AlphaGeometry2, which can now outperform gold medalists in the International Mathematical Olympiad (IMO). This advancement builds upon the original AlphaGeometry system that achieved silver-medal performance just a year ago.

Key developments: AlphaGeometry2 solved 84% of the IMO geometry problems set over the past 25 years.

  • The system integrates Google’s advanced Gemini large language model and incorporates new capabilities like geometric object manipulation and linear equation solving
  • The AI can now move points along lines and adjust triangle heights, demonstrating sophisticated spatial reasoning abilities
  • AlphaGeometry2 significantly outperformed its predecessor, which solved 54% of historical IMO geometry problems

Technical architecture: AlphaGeometry2 combines a language model with a symbolic reasoning engine to reach top-human performance on olympiad geometry.

  • The system uses a specialized language model trained in formal mathematical language, enabling automatic verification of logical rigor
  • It employs a ‘neuro-symbolic’ system with human-coded abstract reasoning rather than purely data-driven learning
  • This hybrid approach helps prevent the “hallucinations” common in traditional AI chatbots
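
The division of labor described above can be illustrated with a toy sketch. It is not DeepMind's implementation: the rule set, the string encoding of facts, and the names `deduce`, `solve`, and `midpoint_proposer` are all hypothetical, and the "proposer" is a hard-coded stand-in for a language model. It assumes the model's role is to suggest an auxiliary construction, while a symbolic engine applies hand-written rules so that every derived statement follows from known facts.

```python
from typing import Callable, Optional

Fact = str
Rule = tuple[frozenset[Fact], Fact]  # (premises, conclusion)

def deduce(facts: set[Fact], rules: list[Rule]) -> set[Fact]:
    """Symbolic step: apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def solve(facts: set[Fact], rules: list[Rule], goal: Fact,
          propose: Callable[[set[Fact]], Optional[Fact]],
          max_rounds: int = 3) -> bool:
    """Alternate between symbolic deduction and proposed constructions."""
    known = deduce(facts, rules)
    for _ in range(max_rounds):
        if goal in known:
            return True
        suggestion = propose(known)  # stand-in for the language model
        if suggestion is None:
            break
        known = deduce(known | {suggestion}, rules)
    return goal in known

# Toy problem (hypothetical encoding): base angles of an isosceles triangle.
MIDPOINT = "M is the midpoint of BC"
GOAL = "angle ABC = angle ACB"
FACTS = {"AB = AC"}
RULES: list[Rule] = [
    (frozenset({"AB = AC", MIDPOINT}), "triangle ABM is congruent to triangle ACM"),
    (frozenset({"triangle ABM is congruent to triangle ACM"}), GOAL),
]

def midpoint_proposer(known: set[Fact]) -> Optional[Fact]:
    """Hard-coded proposer: suggest constructing the midpoint once."""
    return MIDPOINT if MIDPOINT not in known else None

def no_proposer(known: set[Fact]) -> Optional[Fact]:
    """Proposer that never suggests anything."""
    return None

with_help = solve(FACTS, RULES, GOAL, midpoint_proposer)
without_help = solve(FACTS, RULES, GOAL, no_proposer)
print(with_help, without_help)
```

In this sketch the rules alone cannot reach the goal; the proof goes through only after the proposer adds the midpoint construction. Because every conclusion is produced by an explicit rule, the engine cannot assert a statement it has not derived.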

Competitive landscape: Other research teams are also making significant strides in mathematical AI capabilities.

  • Teams from India and China have achieved gold-medal-level performance in geometry using different approaches, though they were tested on a smaller set of problems
  • A $5-million AI Mathematical Olympiad Prize awaits the first open-source AI system to achieve overall gold-medal performance
  • Current DeepMind systems are not eligible for this prize as they are not open-source

Expert perspectives: Mathematical experts view these developments with measured optimism while acknowledging remaining challenges.

  • Kevin Buzzard of Imperial College London predicts AI will soon achieve perfect IMO scores
  • However, experts note that IMO problems, while difficult, are conceptually simpler than research-level mathematics
  • The next IMO, in Sunshine Coast, Australia, will provide a crucial test, as fresh problems rule out training data contamination

Looking ahead: The path forward for mathematical AI systems involves addressing increasingly complex problem types and demonstrating consistent performance.

  • DeepMind’s team plans to expand AlphaGeometry’s capabilities to handle inequalities and non-linear equations
  • According to the research team, these improvements will be necessary to “fully solve geometry”
  • Systems must prove themselves on new, unseen problems to definitively demonstrate true problem-solving capabilities rather than pattern recognition

Future implications: While the achievement represents significant progress in mathematical AI, important distinctions remain between solving competition problems and advancing mathematical research.
