Neural Network Learns Spatial Mapping from Visual Experience, Mirroring Brain’s Cognitive Maps

The predictive coding neural network constructs an implicit spatial map of its environment by assembling information from local exploration into a global representation within its latent space.

Key takeaways: The network, trained on a next-image prediction task while navigating a virtual environment, automatically learns an internal map that quantitatively reflects spatial distances:

  • The network’s latent space encodes accurate spatial positions, enabling the agent to pinpoint its location using only visual information.
  • Distances between image representations in latent space correlate strongly with actual physical distances in the environment.
  • Individual latent space units exhibit localized, overlapping receptive fields akin to place cells in the mammalian brain, providing a unique code for each spatial position.
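To make the distance claim concrete, here is a minimal sketch of how one might measure the correlation between latent and physical distances. It does not use the study's network: the `toy_latent` function, the 5x5 grid of Gaussian fields, and the width `SIGMA` are all assumptions standing in for learned place-cell-like units.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed stand-in for the learned latent space: 25 Gaussian "place fields"
# tiled over a unit-square arena. The real network learns its own units.
grid = np.linspace(0.0, 1.0, 5)
centers = np.array([(cx, cy) for cx in grid for cy in grid])
SIGMA = 0.3  # hypothetical field width


def toy_latent(xy):
    """Map (n, 2) positions to (n, 25) overlapping field activations."""
    d2 = ((xy[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * SIGMA ** 2))


def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


positions = rng.uniform(0.0, 1.0, size=(200, 2))
latents = toy_latent(positions)

# Keep each unordered pair once (upper triangle, no diagonal).
upper = np.triu_indices(len(positions), k=1)
physical = pairwise_distances(positions)[upper]
latent = pairwise_distances(latents)[upper]

# Pearson correlation between physical and latent distances.
r = np.corrcoef(physical, latent)[0, 1]
print(f"latent vs. physical distance correlation: {r:.2f}")
```

With a trained model, `toy_latent` would be replaced by the network's encoder applied to the image seen at each position; the rest of the measurement stays the same.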

Comparing predictive coding to auto-encoding: The prediction task itself proves essential for spatial mapping, as a non-predictive auto-encoder network fails to distinguish visually similar but spatially distant locations:

  • The auto-encoder captures less positional information and exhibits weaker correlation between latent and physical distances compared to the predictive network.
  • In theory, some visually symmetric environments provably cannot be mapped by an auto-encoder, whereas predictive coding can map them.
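The intuition behind the symmetry argument can be shown with a toy example. This is an illustration only, not the paper's proof: the four-position ring world `OBS`, the clockwise walk, and both encoder stand-ins are hypothetical. An encoder that sees only the current image must give identical-looking positions the same code, while one that also uses the preceding image can tell them apart.

```python
# Toy ring world: position i shows image OBS[i].
# Positions 0 and 2 look identical ("A") but are spatially distinct.
OBS = ["A", "B", "A", "C"]


def image_only_code(pos):
    """Stand-in for an auto-encoder: code depends on the current image alone."""
    return OBS[pos]


def history_code(pos, steps=1):
    """Stand-in for a predictive coder: code also depends on the images
    seen over the previous `steps` moves of a clockwise walk."""
    n = len(OBS)
    return tuple(OBS[(pos - k) % n] for k in range(steps, -1, -1))


positions = range(len(OBS))
ae_codes = {p: image_only_code(p) for p in positions}
pc_codes = {p: history_code(p) for p in positions}

# A collision means two different positions share one code.
ae_collisions = len(OBS) - len(set(ae_codes.values()))
pc_collisions = len(OBS) - len(set(pc_codes.values()))

print("image-only collisions:", ae_collisions)
print("history-based collisions:", pc_collisions)
```

Here the image-only code confuses positions 0 and 2, while the history-based code separates all four positions because the two "A" views are reached from different neighbors.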

Emergent properties supporting navigation: The predictive network’s latent space representation naturally supports vector-based navigation:

  • Differences in overlapping receptive field activations between two positions reliably encode the distance and direction to navigate between them.
  • This mechanism captures the majority of the network’s spatial information content.
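A rough sketch of this vector-navigation idea: if units behave like overlapping place fields, then the difference between the activation patterns at two positions can be read out as a displacement vector. The Gaussian fields, their 6x6 layout, and the least-squares readout below are assumptions chosen for illustration, not the study's actual decoding procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical place-field layout: 6x6 grid of Gaussian fields on a unit square.
grid = np.linspace(0.0, 1.0, 6)
centers = np.array([(cx, cy) for cx in grid for cy in grid])
SIGMA = 0.25  # assumed field width


def place_code(xy):
    """Map (n, 2) positions to (n, 36) overlapping field activations."""
    d2 = ((xy[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * SIGMA ** 2))


def sample_pairs(n):
    """Random start/goal pairs: activation differences and true displacements."""
    start = rng.uniform(0.0, 1.0, size=(n, 2))
    goal = rng.uniform(0.0, 1.0, size=(n, 2))
    return place_code(goal) - place_code(start), goal - start


# Fit a linear readout from activation differences to displacement vectors.
train_diff, train_disp = sample_pairs(2000)
readout, *_ = np.linalg.lstsq(train_diff, train_disp, rcond=None)

# Evaluate on held-out pairs.
test_diff, test_disp = sample_pairs(500)
pred_disp = test_diff @ readout
mean_error = np.linalg.norm(pred_disp - test_disp, axis=1).mean()
mean_length = np.linalg.norm(test_disp, axis=1).mean()

# Distance and heading follow directly from the decoded vector.
distance = np.linalg.norm(pred_disp, axis=1)
heading = np.arctan2(pred_disp[:, 1], pred_disp[:, 0])

print(f"mean decoding error: {mean_error:.4f} (mean path length {mean_length:.2f})")
```

The readout works on differences rather than raw activations, so no bias term is needed: a start and goal at the same position produce a zero difference and therefore a zero displacement.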

Broader implications: Predictive coding provides a unified computational framework for constructing cognitive maps across sensory modalities:

  • The theoretical analysis generalizes to any temporally correlated, vector-valued sensory data, including auditory, tactile, and linguistic inputs.
  • Similar spatial and non-spatial cognitive maps in the brain suggest predictive coding as a universal neural mechanism for information representation and reasoning.
  • Results potentially connect theories of hippocampal place cells, cortical sensory maps, and the geometric organization of concepts in large language models.

In summary, this work mathematically and empirically demonstrates that predictive coding enables the automated construction of spatial cognitive maps purely from sensory experience, without relying on specialized or innate inference procedures. The model exhibits correspondences with neural representations and unifies perspectives on cognitive mapping across domains. However, the study does not experimentally validate the framework beyond visual inputs in a virtual setting. Further research exploring multi-modal predictive mapping and its implications for generalized reasoning would help establish predictive coding as a unifying theory of information representation in the brain and artificial systems.

Source paper: Automated construction of cognitive maps with visual predictive coding
