
Ecological Language: A multimodal approach to language and the brain

Periodic Reporting for period 4 - ECOLANG (Ecological Language: A multimodal approach to language and the brain)

Reporting period: 2022-07-01 to 2024-06-30

Language is learnt and mostly used in face-to-face contexts, in which multiple cues contribute to learning and comprehension. These cues include what is being said, the intonation of the voice, the face (especially mouth movements), and the hand gestures that are time-locked and related to what is being said (such as pointing to an object while talking about it, or gestures that evoke imagery of what is being said). Although the importance of caregivers’ non-verbal behaviours has long been acknowledged, much of our knowledge of the psychological and neural mechanisms underlying how language is learnt and used still comes almost exclusively from experimental studies of speech or text, in which this rich multimodal context is not taken into account.

The ECOLANG project studied language learning and comprehension in the real world, investigating language as the ensemble of the multimodal cues that accompany speech in face-to-face communication. We asked whether and how children and adults use these multimodal cues to learn new vocabulary and to process known words. We further asked how the brain integrates the different types of multimodal information during comprehension. Answering these questions is key to developing new treatments for developmental and acquired language disorders, and provides important constraints for automatic language processing, leading to improved performance of automatic language recognition and dialogue systems.

By developing, annotating and analysing a large corpus of dyadic conversation in carefully crafted, lifelike scenarios, we have been able to characterise when speakers use specific verbal and non-verbal (e.g. gesture or eye-gaze) behaviours, and how these behaviours are associated with a partner’s (child or adult) general ease of comprehension as well as their learning of new vocabulary and concepts. By carrying out behavioural, electrophysiological and imaging studies, we have been able to identify how the different multimodal behaviours are integrated with speech online during comprehension, and how the brain regions that process words and sentences dynamically coordinate with the regions that process hand and facial movements. Altogether, ECOLANG has provided a clear first snapshot of how speakers and comprehenders (children or adults) use multimodal language in face-to-face contexts, and of how processing of multimodal language differs from that of spoken or written language.
We have collected, annotated and prepared for public release the ECOLANG corpus: a large multimodal corpus of the verbal and non-verbal (speech, gesture, eye-gaze) behaviours of a speaker in interaction with a child or adult partner. Conversations across all dyads are comparable and centre on lifelike scenarios. To our knowledge, this will be the first large corpus (more than 50 hours of recordings) annotated for all of these behaviours; its public release is imminent, and we expect it to be useful to researchers from a variety of disciplines.
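The release format of the corpus is not described here, so the following is only a hypothetical sketch, in Python, of what a time-aligned multimodal annotation record could look like and how time-locked speech-gesture pairs might be retrieved from it. All tier and field names are illustrative assumptions, not the published schema.

```python
# Hypothetical annotation schema: the real ECOLANG release format is
# not described here, so every name below is an illustrative assumption.
from dataclasses import dataclass, field

@dataclass
class AnnotationEvent:
    tier: str       # e.g. "speech", "gesture", "eye_gaze"
    start_ms: int   # onset relative to the recording
    end_ms: int     # offset relative to the recording
    label: str      # transcript token or behaviour category

@dataclass
class DyadRecording:
    dyad_id: str       # one speaker-partner pair
    partner_type: str  # "child" or "adult"
    events: list[AnnotationEvent] = field(default_factory=list)

    def overlapping(self, tier_a: str, tier_b: str) -> list[tuple]:
        """Pairs of events from two tiers that overlap in time, e.g. a
        gesture time-locked to the spoken word it accompanies."""
        a = [e for e in self.events if e.tier == tier_a]
        b = [e for e in self.events if e.tier == tier_b]
        return [(x, y) for x in a for y in b
                if x.start_ms < y.end_ms and y.start_ms < x.end_ms]

# Example: a pointing gesture co-occurring with the word it refers to.
rec = DyadRecording("dyad_01", "child", [
    AnnotationEvent("speech", 1200, 1650, "ball"),
    AnnotationEvent("gesture", 1150, 1700, "point"),
])
print(rec.overlapping("speech", "gesture"))
```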

Analyses of the corpus data have allowed us to make significant progress in understanding when and why speakers use non-verbal behaviours in addition to speech. Overall, speakers use specific non-verbal cues when these are most useful to their listeners. For example, they use iconic non-verbal behaviours (iconic gestures or vocal iconicity) when talking about objects that are not in view, when they talk about novel objects, and when they are about to say a word that is less predictable in context. We were also able to link these behaviours to the listener’s (child or adult) learning of new words and concepts, finding that a small set of non-verbal behaviours (especially points and iconic gestures) supports learning in interaction with other linguistic variables, while other non-verbal behaviours (manipulation of objects) instead hinder learning. These findings provide a clear set of dos and don’ts for improving successful teaching.
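The summary does not specify the statistical models behind these analyses; the sketch below is a minimal illustration, on invented data, of how such behaviour-learning links can be tested with a mixed-effects regression, using a random intercept per dyad so that repeated observations from the same speaker-listener pair are not treated as independent. All variable names, data and effect sizes are assumptions made for the example.

```python
# Illustrative only: simulated data standing in for per-word corpus
# observations; none of these names come from the ECOLANG release.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "iconic_gesture": rng.integers(0, 2, n),  # speaker produced an iconic gesture
    "pointing":       rng.integers(0, 2, n),  # speaker pointed at the referent
    "manipulation":   rng.integers(0, 2, n),  # speaker manipulated the object
    "dyad":           rng.choice([f"d{i:02d}" for i in range(20)], n),
})
# Simulate a post-test learning score mirroring the pattern reported
# above: gestures and points help, object manipulation hurts.
data["learning_score"] = (
    0.5 * data["iconic_gesture"]
    + 0.4 * data["pointing"]
    - 0.3 * data["manipulation"]
    + rng.normal(0, 0.5, n)
)

# Linear mixed model with a random intercept per dyad.
model = smf.mixedlm(
    "learning_score ~ iconic_gesture + pointing + manipulation",
    data,
    groups=data["dyad"],
)
print(model.fit().summary())
```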

In behavioural and electrophysiological studies, we assessed the impact of multimodal cues on word and discourse processing. We found that informative (i.e. more iconic) gestures always speed up processing, whereas seeing the speaker’s mouth movements only helps when processing is difficult (either because there is background noise or because the listener is a non-native speaker). In electrophysiological studies, we found that all the multimodal cues we investigated affect word processing, indicating that they are central to language processing, and that their impact changes dynamically depending on their informativeness. These studies provide a first snapshot of how the brain dynamically weights audiovisual cues in language comprehension. Finally, in work with people with aphasia, we identified the neural regions involved in integrating speech, iconic gestures and mouth movements. These results are clinically relevant, as they provide insight into which patients can benefit from audio-visual treatment.
ECOLANG pioneered a new way to empirically study spoken language. We study real-world language as a multimodal phenomenon: the input comprises linguistic information, but also prosody, mouth movements, eye gaze and gestures. This contrasts with other current approaches, in which language is either reduced to speech or text, or only one specific non-verbal behaviour is considered (gesture, prosody or eye gaze). By bringing in the multimodal context, we blur the traditional distinction between language and communication currently drawn in linguistics, psychology and the neurobiology of language. Crucially, we also study language in real-world settings in which interlocutors are adults or children and learning new words is intertwined with processing known words. This contrasts with current approaches in which language processing is studied in adults and language learning is studied in children.
The project demonstrated the usefulness of this approach as it made important discoveries concerning: (a) how the different cues dynamically interact during processing, using evidence from behavioural, electrophysiological and imaging studies; (b) how they are used in learning and processing by children and adults (and what the differences between these two groups are).
[Figure: design and example screenshots from the corpus collection]