CORDIS - EU research results

Communication in Action: Towards a model of Contextualized Action and Language Processing

Periodic Reporting for period 4 - CoAct (Communication in Action: Towards a model of Contextualized Action and Language Processing)

Reporting period: 2023-03-01 to 2024-11-30

Language is fundamental to human sociality. While the last century has given us many fundamental insights into how we use and understand it, core issues that we face when doing so within its natural environment, face-to-face conversation, remain untackled. When we speak, we also send signals with our head, eyes, face, hands, torso, etc. How do we orchestrate and integrate all this information into meaningful messages? CoAct will lead to a new framework with situated language processing at its core. The defining characteristic of in situ language is its multimodal nature. Moreover, the essence of language use is social action; that is, we use language to do things: we question, offer, propose, decline, etc. These social actions are embedded in conversational structure where one speaking turn follows another at a remarkable speed, with millisecond gaps between them. Conversation thus confronts us with a significant psycholinguistic challenge. While one could expect that the many co-speech bodily signals exacerbate this challenge, CoAct proposes that they actually play a key role in dealing with it. The results from a range of corpus studies based on Dutch casual face-to-face conversations, as well as experiments, confirm the hypotheses proposed in the project: visual signals like co-speech hand gestures and facial signals occur primarily early on during utterances, or at least prior to related information in the speech, and this leads to earlier response planning and faster overt responding, at least partly by feeding into the prediction of upcoming information. The project’s findings have further shown that the meaning gleaned from speakers’ utterances is also fundamentally influenced by the presence of different visual bodily signals, even when these accompany the same verbal message. These key insights advance current thinking in the language sciences by underlining the importance of reconceptualising human language as a multimodal phenomenon.
Moreover, they critically advance psycholinguistic theory by demonstrating that the visual components of utterances in face-to-face settings directly influence the time course of linguistic processing, as well as the type of meaning recipients derive from it. In addition to empirical results, the project has also led to new theoretical frameworks that capture these ideas.
The project has led to a large corpus of casual conversations including audio, video and kinematic recordings. These data have been analysed to answer various questions relating to how people communicate intentions multimodally in social interaction, and how these multimodal signals feed into the process of creating and comprehending meaning in face-to-face conversational interaction. Moreover, the corpus data informed the formation of hypotheses we tested in experiments about multimodal intention communication, both in terms of producing communicative acts and how these are processed on a cognitive and neural level. Together, these corpus and experimental studies have led to a number of interesting insights, among them the following:
- People produced facial signals primarily early on in utterances, thus equipping them with predictive potential.
- There are relatively stable associations between specific facial signals and specific types of speaker intentions, such as eyebrow frowns being characteristic of both questions and repair initiations.
- Not only individual signals, but also their specific combinations, are perceived as typically associated with specific social actions. For example, a verbal utterance accompanied by an eyebrow raise plus a forward tilt is primarily associated with requesting information, while the same words accompanied by an eyebrow raise plus a sideward head-turn are primarily associated with skepticism.
- Together with their early timing, this means that these facial signals may provide recipients with early clues about the intention a speaker means to convey with a conversational turn.
- Hand gestures have also been shown to have predictive potential, since they tend to closely precede semantically corresponding information in the speech. Behavioural and EEG experiments have shown that seeing gestures indeed leads to processing advantages, and that semantic prediction is one core mechanism bringing this about.
- The project has also investigated pragmatic hand gestures. Variation in the form of these gestures, too, seems to be associated with particular speaker intentions in our Dutch conversational corpus. Perception experiments with these gestures as stimuli are still in progress; in particular, they investigate whether perceiving these gestures leads to faster intention recognition and, as a consequence, to faster response planning in conversation.
These various findings, and the theoretical frameworks they have led to, have been disseminated in a wide range of high-impact peer-reviewed academic journals from the fields of the cognitive and language sciences, and they have been presented in talks and keynote speeches at conferences in the fields of cognitive science, linguistics, psycholinguistics, and neuroscience. The research from this project has also received significant coverage in the media, including press releases and longer features.
The project’s aim was to shed light on how people use their bodies in conjunction with speech to encode meaning in conversational interaction, and how these complex multimodal constructs are processed by the brain. The project significantly advanced our understanding of the communication of intentions (speech acts/social actions) on a behavioural, cognitive, neural and physiological level by using an innovative combination of techniques (combining corpus analyses, behavioural and neurocognitive experiments). The impact of the project thus went beyond novel empirical and theoretical insights, primarily by establishing the fusion of methods, and by developing experimental paradigms that capture many more of the social dimensions that characterize language use in its most natural environment, namely face-to-face communication. The project has led to pipeline development for animating virtual agents, and for creating quasi-interactive set-ups that still allow for experimental control. Moreover, the project paved the way for establishing corpus-informed hypotheses as a method leading to more ecologically valid experimental data. Finally, the project’s findings lay the foundation stones for investigating multimodal intention communication not only in healthy but also in neurotypically diverse populations (especially those groups who are often characterised by different pragmatic and multimodal communication abilities, such as in ASD or Parkinson’s disease). The project’s findings also lay the foundations for developing more naturalistically behaving virtual humans. Given the dramatic rise in the use of virtual communication technologies and the projected future developments in this domain, getting a better handle on how more naturalistically behaving virtual humans can benefit society (e.g. in therapeutic, educational or training environments) is critical.
Face-to-face conversation