Neural Network Learns Spatial Mapping from Visual Experience, Mirroring Brain’s Cognitive Maps

The predictive coding neural network constructs an implicit spatial map of its environment by assembling information from local exploration into a global representation within its latent space.

Key takeaways: The network, trained on a next-image prediction task while navigating a virtual environment, automatically learns an internal map that quantitatively reflects spatial distances:

  • The network’s latent space encodes accurate spatial positions, enabling the agent to pinpoint its location using only visual information.
  • Distances between image representations in latent space correlate strongly with actual physical distances in the environment.
  • Individual latent space units exhibit localized, overlapping receptive fields akin to place cells in the mammalian brain, providing a unique code for each spatial position.
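
The distance correlation described in the takeaways above can be probed with a simple diagnostic. Below is a minimal sketch, not the paper's code: it assumes you have latent codes from the trained network paired with ground-truth positions, and it fakes both with toy stand-ins so the script runs on its own.

```python
# Minimal sketch (not the paper's code) of two probes on the latent space:
# linear position decoding and latent-vs-physical distance correlation.
# `latents` would be hidden states from the trained predictive network;
# here they are toy stand-ins so the script is self-contained.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Ground-truth (x, y) positions visited while exploring the virtual arena.
positions = rng.uniform(0, 10, size=(500, 2))

# Stand-in latent codes: a noisy linear embedding of position (assumption).
latents = positions @ rng.normal(size=(2, 32)) + 0.1 * rng.normal(size=(500, 32))

# 1) Decode position from the latent code with a linear (least-squares) readout.
w, *_ = np.linalg.lstsq(latents, positions, rcond=None)
err = np.linalg.norm(latents @ w - positions, axis=1).mean()
print(f"mean position-decoding error: {err:.3f} environment units")

# 2) Correlate pairwise distances in latent space with physical distances.
r, _ = pearsonr(pdist(latents), pdist(positions))
print(f"latent vs. physical distance correlation: r = {r:.3f}")
```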

Comparing predictive coding to auto-encoding: The prediction task itself proves essential for spatial mapping, as a non-predictive auto-encoder network fails to distinguish visually similar but spatially distant locations:

  • The auto-encoder captures less positional information and exhibits weaker correlation between latent and physical distances compared to the predictive network.
  • Theoretically, there exist environments with visual symmetry that auto-encoders provably cannot map, while predictive coding succeeds.
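
As a rough illustration of why the objective matters, here is a schematic comparison of the two losses, assuming a generic encoder/decoder pair. The module names and shapes are placeholders rather than the study's architecture, and the actual model conditions on a short history of frames; this single-frame version only contrasts the targets.

```python
# Schematic comparison of the two training objectives (illustrative only).
import torch
import torch.nn as nn

img_shape, code_dim = (3, 64, 64), 128
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, code_dim))
decoder = nn.Sequential(nn.Linear(code_dim, 3 * 64 * 64), nn.Unflatten(1, img_shape))

frames = torch.rand(8, *img_shape)      # frames[t]: observation at step t of a walk
x_t, x_next = frames[:-1], frames[1:]   # consecutive observations

# Auto-encoding: reconstruct the *same* frame; two look-alike places can
# legitimately share a code, so spatial structure is not forced to emerge.
ae_loss = nn.functional.mse_loss(decoder(encoder(x_t)), x_t)

# Predictive coding: predict the *next* frame from the current code; what comes
# next depends on where you are, pushing look-alike places apart in latent space.
pc_loss = nn.functional.mse_loss(decoder(encoder(x_t)), x_next)
```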

Emergent properties supporting navigation: The predictive network’s latent space representation naturally supports vector-based navigation:

  • Differences in overlapping receptive field activations between two positions reliably encode the distance and direction to navigate between them.
  • This mechanism captures the majority of the network’s spatial information content.
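
To make the vector-navigation idea concrete, here is a toy sketch in which localized, overlapping receptive fields are hand-built as Gaussian bumps (in the study they emerge from training) and a linear readout maps activity differences to displacement vectors.

```python
# Toy model of vector navigation from place-cell-like units (illustrative).
import numpy as np

rng = np.random.default_rng(1)
centers = rng.uniform(0, 10, size=(64, 2))   # receptive-field centers (assumed)
width = 2.0                                  # receptive-field width (assumed)

def activity(pos):
    """Population response of 64 localized, overlapping units at `pos`."""
    return np.exp(-np.sum((centers - pos) ** 2, axis=1) / (2 * width ** 2))

# Fit a linear readout from activity *differences* to displacement vectors.
starts = rng.uniform(0, 10, size=(2000, 2))
goals = rng.uniform(0, 10, size=(2000, 2))
diffs = np.array([activity(g) - activity(s) for s, g in zip(starts, goals)])
readout, *_ = np.linalg.lstsq(diffs, goals - starts, rcond=None)

# "Activity here minus activity there" now reads out as "which way to go".
s, g = np.array([2.0, 3.0]), np.array([7.0, 8.0])
print((activity(g) - activity(s)) @ readout)   # roughly [5, 5]
```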

Broader implications: Predictive coding provides a unified computational framework for constructing cognitive maps across sensory modalities:

  • The theoretical analysis generalizes to any temporally-correlated vector-valued sensory data, including auditory, tactile, and linguistic inputs.
  • Similar spatial and non-spatial cognitive maps in the brain suggest predictive coding as a universal neural mechanism for information representation and reasoning.
  • Results potentially connect theories of hippocampal place cells, cortical sensory maps, and the geometric organization of concepts in large language models.

In summary, this work mathematically and empirically demonstrates that predictive coding enables the automated construction of spatial cognitive maps purely from sensory experience, without relying on specialized or innate inference procedures. The model exhibits correspondences with neural representations and unifies perspectives on cognitive mapping across domains. However, the study does not experimentally validate the framework beyond visual inputs in a virtual setting. Further research exploring multi-modal predictive mapping and its implications for generalized reasoning would help establish predictive coding as a unifying theory of information representation in the brain and artificial systems.

Source paper: Automated construction of cognitive maps with visual predictive coding
