CORDIS - EU research results

From Meaning to Perception: How narratives shape visual processing

Periodic Reporting for period 1 - VisualNarrative (From Meaning to Perception: How narratives shape visual processing)

Reporting period: 2023-11-01 to 2025-10-31

Humans constantly generate expectations about what will happen next, which allows us to process overwhelming sensory input more efficiently. However, most studies examining the influence of expectations on sensory processing to date have used rather oversimplified paradigms that lack higher-order meaning, such as simple associations between stimuli. To overcome this barrier, this project aimed to assess how more cognitive, meaning-based expectations influence visual processing. To this end, I used picture books as a more ecologically valid paradigm that still allows the meaningfulness of stimuli to be manipulated experimentally: subsequent pictures from a book were presented either in a coherent (i.e. correct) order or in a shuffled order in which the meaning was lost. I used state-of-the-art large language models to quantify expectations for such stimuli and related them to visual processing. Together, this project bridged higher-order cognition and lower-level visual processing.
I performed an eye-tracking experiment in which I presented the picture books to participants, either in a coherent or a shuffled order, while measuring their eye movements during free viewing. I then used GPT-2, a large language model, to quantify, from separately obtained verbal narratives, which parts of the images were semantically salient for understanding the story. In parallel, I used a state-of-the-art model of visual salience (DeepGaze-II) to assess what was salient from a purely visual standpoint, and assessed how both measures related to participants' gaze behaviour. I found that the new semantic salience model could account for differences in eye movement behaviour depending on the condition in which images were presented: when images formed a coherent sequence, participants looked relatively more often, and for relatively longer, at objects that were semantically salient. Visual salience, in contrast, did not capture differences in gaze behaviour between conditions. This shows that state-of-the-art models of visual salience lack explanatory power when accounting for gaze behaviour in more naturalistic, meaningful settings. Language models, in turn, offer a promising avenue for capturing cognitive influences on visual processing more generally.
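The logic of this analysis can be illustrated with a minimal sketch. Everything below is hypothetical and simulated for illustration only: per-object semantic salience scores (which in the project were derived from GPT-2 over narrative descriptions, not simulated) and relative dwell times per condition, followed by a check of whether dwell time tracks semantic salience more strongly in the coherent than in the shuffled condition.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data for 30 objects: a semantic salience score per object
# (in the actual project, derived from a language model) and the relative
# dwell time participants spent on each object, per condition.
rng = np.random.default_rng(0)
semantic_salience = rng.uniform(0, 1, size=30)

# Simulated pattern mirroring the reported finding: dwell time tracks
# semantic salience in the coherent condition, but only weakly when the
# image order is shuffled and the narrative meaning is lost.
dwell_coherent = 0.8 * semantic_salience + 0.2 * rng.uniform(0, 1, 30)
dwell_shuffled = 0.2 * semantic_salience + 0.8 * rng.uniform(0, 1, 30)

# Correlate salience with dwell time within each condition.
r_coh, _ = pearsonr(semantic_salience, dwell_coherent)
r_shuf, _ = pearsonr(semantic_salience, dwell_shuffled)
print(f"coherent r = {r_coh:.2f}, shuffled r = {r_shuf:.2f}")
```

A condition difference of this kind (a stronger salience-dwell correlation for coherent than for shuffled sequences) is the signature that a semantic salience model, unlike a purely visual one, can capture.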
The main result of the project is that a semantic salience model derived from language can account for differences in eye movement behaviour, and better explains gaze behaviour in more meaningful settings.
This could spur considerable further research, especially by extending the paradigm to even more naturalistic settings, such as movie watching or navigation. To that end, the results have been presented at multiple conferences, and a manuscript is currently under review for publication.