
Language dynamics: a neurocognitive approach to incremental interpretation

Periodic Reporting for period 4 - LANGDYN (Language dynamics: a neurocognitive approach to incremental interpretation)

Reporting period: 2020-04-01 to 2022-03-31

A fundamental problem in the language sciences is to explain the remarkable immediacy of human speech comprehension. Listeners understand spoken language literally as they hear it, seamlessly mapping the incoming speech signal onto a rich interpretation of what the speaker is saying. This rapid, continuous interpretative process is based on complex dynamic processes that relate the speech input to different types of neurally represented knowledge about language and about the broader context of speaking and understanding (see Figure 1). The LANGDYN project addresses the key question of how these processes are instantiated in the human brain.

To do so, we pioneered a novel combination of emerging imaging technologies and analysis methods to directly investigate the real-time properties of speech comprehension. Using combined EEG and MEG (EMEG) to capture brain-wide neural activity as listeners heard spoken language, we integrated advanced statistical analysis methods with computational models of linguistic and non-linguistic knowledge to map out the specific neurocomputational content of these dynamic neural processes. This has led to a uniquely detailed analysis of what processes are taking place where and when in the human brain.
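As a rough illustration of this analysis logic (testing, time point by time point, how well a model's predicted similarity structure fits the recorded activity patterns), the following is a minimal representational-similarity-style sketch on simulated data. It is not the project's actual EMEG pipeline; all sizes, variable names, and the simulated signal are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rdm(patterns):
    """Lower triangle of a representational dissimilarity matrix:
    1 - r for every pair of condition patterns."""
    return [1.0 - pearson(patterns[i], patterns[j])
            for i in range(len(patterns)) for j in range(i)]

random.seed(0)
n_words, n_feats, n_sensors, n_times = 6, 8, 32, 10

# Hypothetical model: one feature vector per spoken word
model = [[random.gauss(0, 1) for _ in range(n_feats)] for _ in range(n_words)]
model_rdm = rdm(model)

# Fixed random projection from feature space to 'sensor' space
W = [[random.gauss(0, 1) for _ in range(n_feats)] for _ in range(n_sensors)]

def simulate(word, t):
    """Simulated neural pattern for one word at time t: the model-driven
    signal ramps up over time, on top of sensor noise."""
    weight = t / (n_times - 1)
    return [weight * sum(w * f for w, f in zip(W[s], model[word]))
            + random.gauss(0, 0.1) for s in range(n_sensors)]

# Model-fit timecourse: correlate the neural RDM with the model RDM at each time
fits = [pearson(rdm([simulate(w, t) for w in range(n_words)]), model_rdm)
        for t in range(n_times)]
```

The fit timecourse rises as the simulated signal comes to reflect the model's similarity structure; in a real analysis the neural patterns would come from source-localised EMEG data and the model from linguistic feature vectors.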

The research programme segmented the speech interpretation process into three processing domains. Stage 1 involves access to neural representations of the sound and meaning of spoken words. To identify the basic architecture of this process, we focused on words heard in isolation (Kocagoncu et al, 2017; Clarke et al, 2018; Marslen-Wilson et al, 2021), tracking the real-time patterning of neurocomputational activity across the brain as words are heard and recognised. An early, distributed, non-hierarchical analysis leads within 150 msec of word onset to the activation of possible words in bilateral temporal areas, and within 300-400 msec to the selection of the meaning of the actual word being heard.

The second set of studies (Klimovich-Gray et al, 2019; Fang et al, 2021) asked how the basic processes of word recognition interact with the context in which words are heard. How can access to word meaning in context be achieved so early – typically within 250 ms of word onset – relative to the timing of information flow from the speech signal? Our studies reveal a novel functionally specialised left-hemisphere network supporting a continuous interaction between constraints provided by the speech signal and by the meaning and structure of the utterance being heard. This interaction evolves over the first 100-150 ms of the word, from a more generic, contextually defined mode of constraint integration to a less contextually determined outcome that reflects the specific semantics of the words being heard.

A further pioneering study (Lyu et al, 2019) shows directly that only the contextually relevant meanings of a word become integrated into the current interpretation of the utterance. For a sentence like “The elderly man ate the apple”, where the subject noun (‘man’) and the verb (‘ate’) constrain towards an object noun that is related to food and eating, only the food-related semantic properties of the word (‘apple’) are activated. This rapid selection of context-relevant word meaning reflects the early intervention of generic contextual constraints, upregulating contextually relevant word meanings even before the specific semantics of a word is available. Novel pattern-based measures of directed connectivity across the language network reveal a continuous information flow among left hemisphere temporal, frontal, and parietal brain regions, underpinning this contextual modification of the object noun’s semantics and delineating a broader neural substrate for real-time speech interpretation.
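The core idea behind such pattern-based directed connectivity can be caricatured with a lagged-correlation sketch: if region A drives region B, A's current pattern should predict B's later pattern better than the reverse. This toy simulation stands in for the project's far more sophisticated measures; the regions, lag, and noise levels are all hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(1)
n_times, dim, lag = 60, 12, 3

# Region A: a random multivariate pattern timecourse (the simulated 'source')
A = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_times)]

# Region B echoes A's pattern three samples later, plus noise (the 'receiver')
B = [[0.9 * (A[t - lag][i] if t >= lag else 0.0) + random.gauss(0, 0.3)
      for i in range(dim)] for t in range(n_times)]

def lagged_flow(src, dst, lag):
    """Mean correlation between src's pattern at t and dst's pattern at t+lag."""
    rs = [pearson(src[t], dst[t + lag]) for t in range(len(src) - lag)]
    return sum(rs) / len(rs)

a_to_b = lagged_flow(A, B, lag)   # strong: B is driven by A's earlier patterns
b_to_a = lagged_flow(B, A, lag)   # near zero: A does not follow B
```

Here `a_to_b` comes out high and `b_to_a` near zero, recovering the simulated direction of information flow. More rigorous directed measures additionally control for each region's own past activity.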

This broader substrate was investigated by Choi et al (2019) and Lyu et al (2021), using the same pioneering mixture of analysis techniques. Choi et al (2019) focused on the role of semantic constraint elicited by the incrementally developing context in sentences such as “The experienced walker chose the path”. Quantitative computational linguistic models were generated of the semantic constraints associated with each major part of the sentence (subject noun phrase – ‘the experienced walker’; verb – ‘chose’; object noun – ‘the path’), so that we could track the spatiotemporal pattern of model fit for each processing dimension. The results reveal an extensive bihemispheric neural system that generates incremental constraints on the message-level interpretation of each successive word. We see the early activation of a right-hemisphere (RH) fronto-temporal network generating possible interpretative scenarios as the subject noun phrase is heard. These scenarios recruit a left-hemisphere (LH) fronto-temporal network as they are constrained by the following verb, and terminate in a posterior LH network underpinning the early recognition of the object noun.

These sequential temporal relationships among multiple brain regions demonstrate that the incoming speech is directly interpreted in terms of the listener’s general knowledge of the world. To study this in the LANGDYN context requires computational models of the multi-faceted probabilistic constraints imposed by this broader knowledge environment. In a ground-breaking further study, Lyu et al (2021) exploited the new generation of AI language models to determine whether they could provide neurocognitively relevant computational models of these constraints. Lyu et al (2021) analysed the internal states of a prominent deep language system (BERT) to extract representations of the dynamic structural interpretation unfolding over time in the BERT model space. The resulting computational models, separating out linguistic structural constraints from probabilistic non-linguistic constraints, were tested against human listeners’ brain activity as they processed the same sentences. We found a strong fit for both types of model, exhibiting patterns of activation and connectivity that validate and extend our earlier findings, confirming that the construction of sentential structures in the LH is integrated with general event knowledge from the RH.
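The logic of testing two overlapping models against the same brain response, separating each model's unique contribution, can be illustrated with a semipartial-correlation sketch: regress one predictor out of the other before correlating with the data. This is a toy stand-in for the project's actual BERT-derived models and EMEG analyses; both predictors and the 'neural' response here are simulated, and all names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def residualize(y, x):
    """Remove the best single-predictor linear fit of x from y (simple OLS)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))
    return [b - my - beta * (a - mx) for a, b in zip(x, y)]

random.seed(2)
n = 300

# Two correlated model predictors: a structural-constraint timecourse and a
# probabilistic event-knowledge timecourse that partly overlaps with it
structure = [random.gauss(0, 1) for _ in range(n)]
event = [0.5 * s + random.gauss(0, 1) for s in structure]

# Simulated neural response driven by both predictors plus noise
neural = [1.0 * s + 0.8 * e + random.gauss(0, 0.5)
          for s, e in zip(structure, event)]

# Semipartial correlations: each model's unique fit to the data after
# removing what the other model already explains
unique_structure = pearson(neural, residualize(structure, event))
unique_event = pearson(neural, residualize(event, structure))
```

Both `unique_structure` and `unique_event` remain clearly positive despite the predictors being correlated, which is the signature of two models each explaining variance in the data that the other cannot.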

The LANGDYN project reveals, with unique spatiotemporal specificity, the sequential neurocomputational substrate underpinning incremental interpretation, and suggests a strong algorithmic commonality with recent AI natural language processing systems. Critically, these successes demonstrate the viability and scientific value of a fully interdisciplinary approach: one that focuses just as strongly on the specific neurocomputational content of what is being computed as on the spatiotemporal patterning (the where and the when) of brain activity that has defined the state of the art in more conventional studies.

The impacts of this new style of research in the cognitive neuroscience of language are still largely within the relevant academic communities. Looking forward, the kind of detailed neurocomputational model developed here should be of value both to the systems-level clinical neuroscience of acquired and developmental language disorders and to the refinement of current AI models for speech recognition and natural language interpretation – as explored in the Wingfield et al (2017; 2022) and Lyu et al (2021) papers from the current project.
Figure 1: Processing domains supporting incremental speech comprehension and the dependencies between them.