CORDIS - EU research results

Explaining object permanence with a deep recurrent neural network model of human cortical visual cognition

Periodic Reporting for period 2 - OBJECTPERMOD (Explaining object permanence with a deep recurrent neural network model of human cortical visual cognition)

Reporting period: 2021-09-01 to 2022-08-31

Visual cognition is our ability to recognize the things we see around us and make inferences about their meaning and relationships. A hallmark step in the development of human visual cognition is the acquisition of object permanence. Object permanence is the ability to continue to mentally represent an object that has disappeared from view, for example because it is hidden behind another object. More generally, mental representations of objects that are currently not present underlie complex cognitive processes such as reasoning and planning. While object permanence is fundamental to human vision, its computational mechanisms remain a mystery. To test how these mechanisms give rise to object permanence in the brain, we need to build computational models of cortical processing. Deep convolutional neural network (CNN) models now achieve human-level performance on a range of visual tasks. They have advanced our understanding of human and primate visual cognition, providing a crucial link between the disciplines of psychology, neuroscience, and artificial intelligence. Current deep neural network models of vision, however, lack the fundamental ability of object permanence, limiting their power as models of human visual cognition and as artificially intelligent systems. This project has made significant progress in our understanding of human and machine object vision, yielded novel task sets for dynamic object vision in humans and machines that drive progress in cognitive computational neuroscience and AI, and developed novel computational neural network models of human object perception.
Work performed during the reporting period consisted of three parts: conceptual work, the development of novel task sets for humans and neural network models, and the development of computational models of human object vision, object permanence, and object tracking.

During the reporting period, the project advanced significantly towards its objectives. Conceptual work was performed in the form of a systematic review of empirical phenomena and models of object perception and permanence in humans and machines, spanning the disciplines of cognitive science, classical neural network modelling, and modern engineering. A central outcome of this work is a perspective on how tasks should be designed in order to maximize scientific impact, drive innovation, and foster close interactions between cognitive science and engineering (Peters & Kriegeskorte, 2021). Furthermore, a novel task set, "FlyingObjects", designed to probe dynamic object vision in humans and machines, was developed (Peters, Retchin, & Kriegeskorte, 2022). These tasks can be performed by both computational models and humans and thereby reveal how close computational models come to capturing the human phenomenology of object permanence. Tasks scale from simple, abstracted toy tasks, which have traditionally been employed in cognitive science, to the highly naturalistic and complex tasks (e.g. videos) that engineering typically works with. Such a gamut of tasks, in which all elements of the task can be controlled by the researcher, affords us a principled understanding of object permanence in brains and models.

The second outcome of the conceptual work is a perspective on how deep neural networks can be valuable models in cognitive science (Peters & Kriegeskorte, 2021; Ma & Peters, 2020; Golan et al., 2023). Moreover, this action addressed one of the key questions in vision science, "How does the primate brain combine generative and discriminative computations in vision?", in the form of a Generative Adversarial Collaboration with leaders from cognitive neuroscience, cognitive science, and AI, leading to the publication of a white paper (preprint at https://doi.org/10.48550/arXiv.2401.06005).

A key outcome of the action was the development of computational models of human object vision. HBox is a superset family of neural network architectures designed to study architectural motifs in neural networks and to disentangle computational depth from receptive field size in computational models. This model family is designed to reveal fundamental principles of the human visual system and to bridge the gap between modern neural network models and classic visual neuroscience theory; it was evaluated on a 7T fMRI dataset of humans observing natural images (Peters, Stoffl, & Kriegeskorte, 2022). Furthermore, novel computational models of object persistence were developed that captured human behavior and eye movements while tracking multiple objects through occlusion (Peters, Butkus, & Kriegeskorte, 2022; Peters*, Butkus*, & Kriegeskorte, 2023).
The project has led to a significant advancement of our understanding of how visual cognition is implemented in the brain and thereby contributes to our mechanistic understanding of the human mind. A key impact is the development of shared task sets for cognitive neuroscience and AI. These tasks will drive progress in both fields, helping to align AI with human cognition and to use modern AI for understanding the human mind and brain.
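To illustrate why depth and receptive field size are normally entangled in convolutional models, the following minimal sketch computes the receptive field of a stack of convolutional layers using the standard recurrence. It is not the HBox implementation, which is not described here; the function name `receptive_field` and the example layer configurations are assumptions chosen purely for illustration.

```python
from typing import List, Tuple

# Each layer is described as (kernel_size, stride, dilation).
Layer = Tuple[int, int, int]


def receptive_field(layers: List[Layer]) -> int:
    """Receptive field size (in input pixels, along one axis) of one output unit.

    Uses the standard recurrence: each layer widens the field by
    (kernel - 1) * dilation * jump, where jump is the product of the
    strides of all preceding layers.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between adjacent units
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical configurations: different depths, same receptive field.
shallow_wide = [(7, 1, 1)] * 2  # 2 layers of 7-wide kernels
deep_narrow = [(3, 1, 1)] * 6   # 6 layers of 3-wide kernels

print(receptive_field(shallow_wide), receptive_field(deep_narrow))
```

In a stack with a fixed kernel size, adding layers necessarily enlarges the receptive field. Varying kernel size (or stride and dilation) alongside depth, as in the two hypothetical stacks above, is one way to hold the receptive field constant while changing the number of computational steps.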