The predictive coding neural network constructs an implicit spatial map of its environment by assembling information from local exploration into a global representation within its latent space.

Key takeaways: The network, trained on a next-image prediction task while navigating a virtual environment, automatically learns an internal map that quantitatively reflects spatial distances:

  • The network’s latent space encodes accurate spatial positions, enabling the agent to pinpoint its location using only visual information.
  • Distances between image representations in latent space correlate strongly with actual physical distances in the environment.
  • Individual latent space units exhibit localized, overlapping receptive fields akin to place cells in the mammalian brain, providing a unique code for each spatial position.
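The position-decoding claim can be illustrated with a toy linear probe, the standard way such claims are tested: if latent codes carry spatial information, a simple readout fit on (latent, position) pairs should recover held-out positions. Everything below is a synthetic stand-in (the random `mixing` matrix plays the role of the trained encoder), not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: 500 visits to random (x, y) locations in a 10 x 10 arena.
n, latent_dim = 500, 16
positions = rng.uniform(0, 10, size=(n, 2))
# Hypothetical "map-like" latents: a linear mixture of position plus noise,
# standing in for codes produced by a trained predictive network.
mixing = rng.normal(size=(2, latent_dim))
latents = positions @ mixing + 0.05 * rng.normal(size=(n, latent_dim))

# Fit a least-squares linear probe latents -> positions (bias via augmentation),
# then evaluate on held-out samples.
train, test = slice(0, 400), slice(400, n)
X = np.hstack([latents, np.ones((n, 1))])
readout, *_ = np.linalg.lstsq(X[train], positions[train], rcond=None)
pred = X[test] @ readout

mean_err = np.mean(np.linalg.norm(pred - positions[test], axis=1))
print(f"mean decoding error: {mean_err:.3f} (arena is 10 x 10)")
```

A small decoding error relative to the arena size is what "the latent space encodes accurate spatial positions" means operationally; the real study applies this kind of probe to codes from the trained network rather than synthetic ones.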

Comparing predictive coding to auto-encoding: The prediction task itself proves essential for spatial mapping, as a non-predictive auto-encoder network fails to distinguish visually similar but spatially distant locations:

  • The auto-encoder captures less positional information and exhibits weaker correlation between latent and physical distances compared to the predictive network.
  • Theoretically, there exist environments with visual symmetry that auto-encoders provably cannot map, while predictive coding succeeds.
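The latent-versus-physical distance comparison used to separate the two models can be sketched as a Pearson correlation between pairwise distances. The "map-like" codes below are a synthetic assumption, not outputs of either network; the point is the metric itself:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for latent codes: a smooth linear embedding of 2D position
# into 8 dimensions, plus noise. A predictive network's codes are claimed to
# behave like this; an auto-encoder's codes would correlate more weakly.
positions = rng.uniform(0, 10, size=(200, 2))
codes = positions @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(200, 8))

def pairwise(a):
    """Euclidean distances over all unique pairs of rows."""
    diff = a[:, None, :] - a[None, :, :]
    d = np.linalg.norm(diff, axis=-1)
    return d[np.triu_indices(len(a), k=1)]

phys, lat = pairwise(positions), pairwise(codes)
r = np.corrcoef(phys, lat)[0, 1]
print(f"Pearson r between latent and physical distances: {r:.3f}")
```

A correlation near 1 indicates the latent space forms a metric map of the environment; running the same comparison on auto-encoder codes is how the weaker correlation reported above would be measured.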

Emergent properties supporting navigation: The predictive network’s latent space representation naturally supports vector-based navigation:

  • Differences in overlapping receptive field activations between two positions reliably encode the distance and direction to navigate between them.
  • This mechanism captures the majority of the network’s spatial information content.
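A minimal sketch of the vector-navigation idea, under the synthetic assumption that codes embed position roughly linearly: a linear map fit on differences of codes at two positions should predict the displacement needed to travel between them. The data and readout here are illustrative, not the paper's mechanism verbatim:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic place-code stand-ins: position embedded linearly into 12 dims + noise.
positions = rng.uniform(0, 10, size=(300, 2))
codes = positions @ rng.normal(size=(2, 12)) + 0.05 * rng.normal(size=(300, 12))

# Random start/goal pairs; the navigation target is the displacement vector.
i = rng.integers(0, 300, size=1000)
j = rng.integers(0, 300, size=1000)
code_diff = codes[j] - codes[i]
displacement = positions[j] - positions[i]

# Linear readout from code differences to (dx, dy), evaluated on held-out pairs.
W, *_ = np.linalg.lstsq(code_diff[:800], displacement[:800], rcond=None)
pred = code_diff[800:] @ W
mean_err = np.mean(np.linalg.norm(pred - displacement[800:], axis=1))
print(f"mean displacement error: {mean_err:.3f}")
```

A small error means the difference of codes alone suffices to read out distance and direction, which is what makes vector-based navigation possible from the map.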

Broader implications: Predictive coding provides a unified computational framework for constructing cognitive maps across sensory modalities:

  • The theoretical analysis generalizes to any temporally correlated vector-valued sensory data, including auditory, tactile, and linguistic inputs.
  • Similar spatial and non-spatial cognitive maps in the brain suggest predictive coding as a universal neural mechanism for information representation and reasoning.
  • Results potentially connect theories of hippocampal place cells, cortical sensory maps, and the geometric organization of concepts in large language models.

In summary, this work mathematically and empirically demonstrates that predictive coding enables the automated construction of spatial cognitive maps purely from sensory experience, without relying on specialized or innate inference procedures. The model exhibits correspondences with neural representations and unifies perspectives on cognitive mapping across domains. However, the study does not experimentally validate the framework beyond visual inputs in a virtual setting. Further research exploring multi-modal predictive mapping and its implications for generalized reasoning would help establish predictive coding as a unifying theory of information representation in the brain and artificial systems.

Automated construction of cognitive maps with visual predictive coding
