AI excels at identifying geographical locations but struggles with objects in retro games

The gap between AI’s geographic prowess and its struggles with pixelated video games highlights an inconsistency in current visual AI capabilities. Large models such as OpenAI’s o3 can identify locations from photographs with minimal visual cues, yet they struggle with seemingly simpler tasks like recognizing objects in vintage games. The discrepancy shows that AI handles different types of visual information very differently, and that current models may have unexpected blind spots.

The puzzle: Current AI models demonstrate contradictory visual recognition abilities that don’t align with human intuition.

  • Large language models like o3 perform remarkably well at GeoGuessr, identifying global locations even from seemingly featureless landscapes.
  • Yet these same models struggle with visually simpler tasks like identifying staircases or doors in Pokémon Red screenshots, even when those objects are clearly marked.

Possible explanations: The contradiction likely stems from differences in training data and visual processing approaches.

  • Geographic images were likely abundant in training data, allowing models to recognize subtle regional differences in vegetation, terrain, and environmental features.
  • Retro game visuals represent a specialized domain with pixel art that may be underrepresented in training datasets despite appearing visually simpler to humans.

Human vs. AI perception: This discrepancy highlights fundamental differences in how humans and AI systems process visual information.

  • Humans find navigating stylized game worlds intuitive because we understand symbolic representation and easily grasp visual abstractions.
  • AI models may excel at tasks they’ve been extensively exposed to through training data while showing surprising weaknesses in domains that seem simpler but are less represented.

The significance: These contradictory capabilities show how uneven current AI visual systems remain.

  • The uneven performance across different visual domains suggests AI visual understanding remains brittle and domain-specific rather than generalizable.
  • This phenomenon demonstrates how AI capabilities don’t necessarily develop along the same trajectory as human visual cognition, creating unexpected strengths and weaknesses.
What's up with AI's vision
