Periodic Reporting for period 2 - OBJECTPERMOD (Explaining object permanence with a deep recurrent neural network model of human cortical visual cognition)
Reporting period: 2021-09-01 to 2022-08-31
During the reporting period, the project advanced significantly towards its objectives. Conceptual work took the form of a systematic review of empirical phenomena and models of object perception and permanence in humans and machines, spanning cognitive science, classical neural network modelling and modern engineering. A central outcome of this work is a perspective on how tasks should be designed in order to maximize scientific impact, drive innovation and foster close interactions between cognitive science and engineering (Peters & Kriegeskorte, 2021). Furthermore, a novel task set, "FlyingObjects", designed to probe dynamic object vision in humans and machines, was developed (Peters, Retchin, & Kriegeskorte, 2022). These tasks can be performed by both computational models and humans and thereby reveal how close computational models come to capturing the human phenomenology of object permanence. The tasks scale from simple, abstracted toy tasks, which have traditionally been employed in cognitive science, to the highly naturalistic and complex tasks (e.g. videos) that engineering typically works with. Such a gamut of tasks, in which all elements can be controlled by the researcher, affords a principled understanding of object permanence in brains and models.
The second outcome of the conceptual work is a perspective on how deep neural networks can be valuable models in cognitive science (Peters & Kriegeskorte, 2021; Ma & Peters, 2020; Golan et al., 2023). Moreover, this action addressed one of the key questions in vision science, "How does the primate brain combine generative and discriminative computations in vision?", in the form of a Generative Adversarial Collaboration with leaders from cognitive neuroscience, cognitive science, and AI, leading to the publication of a white paper (preprint at https://doi.org/10.48550/arXiv.2401.06005).
A key outcome of the action was the development of computational models of human object vision. HBox is a superset family of neural network architectures designed to study architectural motifs in neural networks and to disentangle computational depth from receptive field size in computational models. This model family is intended to reveal fundamental principles of the human visual system and to bridge the gap between modern neural network models and classic visual neuroscience theory. It was evaluated on a 7T fMRI dataset of humans observing natural images (Peters, Stoffl, & Kriegeskorte, 2022). Furthermore, novel computational models of object persistence were developed that captured human behavior and eye movements while tracking multiple objects through occlusion (Peters, Butkus, & Kriegeskorte, 2022; Peters*, Butkus*, & Kriegeskorte, 2023).