
Predicting speech: what and when does the brain predict during language comprehension?

Periodic Reporting for period 1 - PreSpeech (Predicting speech: what and when does the brain predict during language comprehension?)

Reporting period: 2018-09-10 to 2020-09-09

We rely on spoken communication every day, and the speed and accuracy of our language comprehension are remarkable. For an individual, the breakdown or impairment of either of these two components results in life-changing disabilities. Understanding the cognitive mechanisms behind speed and accuracy in word recognition, and how they are enhanced by context, is key to improving the diagnosis and treatment of language disorders. Despite this, there is little understanding of how such fluency is achieved. New cognitive neuroscience theories propose that the human brain achieves speech processing by using context and world knowledge to allow cortical circuits to probabilistically predict and pre-activate upcoming sounds, words and even larger phrases. Many questions regarding this promising approach, however, remain open. The most important are: (a) what are the neurobiological mechanisms of predictive processing of speech? and (b) what is their impact on natural speech comprehension in populations with speech, language or reading disorders? The main scientific objective of PreSpeech was to address these questions in the context of spoken sentence processing in typical and dyslexic readers.

The reason for focusing on the dyslexic population is that predictive processing may serve as a compensation strategy for phonological deficits in dyslexia. Compared to typical readers, dyslexic readers show impaired cortical entrainment to low-frequency auditory speech features (words’ envelopes), and this may be the critical element in their phonological deficit. Using context to generate low-level word-form and higher-level semantic predictions about upcoming words can reduce the burden on the bottom-up analysis of the input and the reliance on entrainment to the prosodic speech contours. This can constitute a compensation strategy for aspects of speech processing in dyslexia. To achieve our goals we assembled a state-of-the-art analysis pipeline that included oscillatory and multivariate techniques applied to time-resolved measures of brain activity. Below we outline the main findings of this project to date.
For the PreSpeech project, MEG (magnetoencephalography) data were collected and pre-processed (de-noised, filtered and cleaned of artifacts) for 43 participants. In addition, structural MRI scans and behavioural measures of IQ, reading and phonological processing ability were collected for each participant. Of these participants, 25 were typical readers and 18 were dyslexic readers (self-reported or diagnosed). All participants listened to naturalistic Spanish sentences which varied in contextual constraints. Furthermore, to explore how predictive processing is affected by situations that occur in natural environments, and to make speech processing more difficult, we included temporally jittered speech conditions. This was done by randomly compressing and expanding the speech audio at different compression rates. The manipulation was designed to simulate situations where one or several speakers in a live conversation change their speech rate and therefore require the listener to dynamically adjust their speech sampling to keep up with comprehension. Since dyslexic readers have difficulty with adaptive entrainment to speech, we predicted that they would also have specific difficulty with such stimuli.
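The temporal jitter manipulation described above can be illustrated with a minimal sketch. This is not the project's actual stimulus code: the function name `jitter_speech`, the segment length and the set of rates are all assumptions made for illustration. It also uses plain linear resampling, which shifts pitch along with duration; a real stimulus pipeline would more likely use a pitch-preserving time-scaling method.

```python
import numpy as np


def jitter_speech(signal, sr, segment_s=0.5, rates=(0.7, 1.0, 1.3), seed=0):
    """Randomly compress or expand consecutive segments of an audio signal.

    Hypothetical illustration: segment_s and rates are assumed values,
    not parameters reported by the project.
    A rate above 1 shortens (compresses) a segment, below 1 lengthens it.
    """
    rng = np.random.default_rng(seed)
    seg_len = int(segment_s * sr)
    pieces = []
    for start in range(0, len(signal), seg_len):
        seg = signal[start:start + seg_len]
        rate = rng.choice(rates)  # one random rate per segment
        new_len = max(1, int(round(len(seg) / rate)))
        # Resample the segment onto a shorter or longer time grid.
        old_t = np.linspace(0.0, 1.0, num=len(seg))
        new_t = np.linspace(0.0, 1.0, num=new_len)
        pieces.append(np.interp(new_t, old_t, seg))
    return np.concatenate(pieces)
```

Calling `jitter_speech(audio, sr)` on a mono waveform returns a new waveform whose local speech rate varies from segment to segment, while the order of the content is unchanged.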

To make sure the spoken sentences were natural and close to spontaneous speech, we built a neural network to select stimuli from large corpora, in collaboration with the Computer Science faculty at the University of the Basque Country. We then analysed brain activity recorded with magnetoencephalography during sentence listening using evoked responses, speech-to-brain synchronization and representational similarity analysis. This resulted in a rich dataset in which we can explore the effects of predictability and temporal noise on natural speech in typical and atypical readers.
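For readers unfamiliar with representational similarity analysis, the general idea can be sketched as follows. This is a generic illustration, not the project's pipeline: the function names, the use of correlation distance and the rank-based comparison are assumptions. Each row of `neural_patterns` stands for the sensor pattern evoked by one item, and each row of `model_patterns` for a hypothetical model description of the same item (for example, a semantic feature vector).

```python
import numpy as np


def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between rows."""
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Off-diagonal upper triangle, so each item pair is counted once."""
    return matrix[np.triu_indices(matrix.shape[0], k=1)]


def rank(values):
    """Simple ranking (ties are not averaged; fine for continuous data)."""
    return np.argsort(np.argsort(values)).astype(float)


def rsa_score(neural_patterns, model_patterns):
    """Spearman-style correlation between neural and model RDMs."""
    neural = rank(upper_triangle(rdm(neural_patterns)))
    model = rank(upper_triangle(rdm(model_patterns)))
    return float(np.corrcoef(neural, model)[0, 1])
```

A higher score means that items the model treats as similar also evoke similar brain patterns. In a time-resolved analysis the score would be computed separately at each time point.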

To date, the dataset for the typical readers (control group) has been analysed extensively, and the analysis of the dyslexic group is ongoing and showing promising results. Our key finding is that the speech areas optimise their processing strategy to both the perceptual and linguistic properties of the speech stimulus. Specifically, we found that speech parsing at the syllabic level, as indexed by speech-to-brain entrainment in the theta band (6.5-8 Hz, Figure 1A), was reduced for temporally jittered speech. Concurrently, for the jittered speech there was a smaller effect of top-down semantic predictions (representational similarity analysis, RSA; Figure 1B) in the left frontotemporal sensors. Together, this shows that the ability to parse speech efficiently through speech-to-brain entrainment to quasi-periodic syllabic elements is critical for enabling more efficient higher-order semantic predictions. However, when speech was both contextually and temporally predictable, the cortical tracking in the delta band (0.5 Hz, Figure 1A) associated with word and phrase tracking was reduced. This implies that for normal, unmodulated speech, predictability reduced the level of lexico-syntactic tracking.
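One common way to quantify speech-to-brain entrainment in a frequency band is the coherence between the speech envelope and a brain signal. The sketch below is an assumption about how such a measure could be computed, not a description of the project's analysis: the function names, the 4-second window and the Hann-windowed, half-overlapping segments are illustrative choices. Only the theta band limits (6.5-8 Hz) come from the text above.

```python
import numpy as np


def coherence_spectrum(x, y, fs, window_s=4.0):
    """Magnitude-squared coherence from Hann-windowed, 50%-overlap segments.

    x and y must have equal length and span at least one window.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = int(window_s * fs)
    win = np.hanning(n)
    sxx = np.zeros(n // 2 + 1)
    syy = np.zeros(n // 2 + 1)
    sxy = np.zeros(n // 2 + 1, dtype=complex)
    for start in range(0, len(x) - n + 1, n // 2):
        fx = np.fft.rfft(win * x[start:start + n])
        fy = np.fft.rfft(win * y[start:start + n])
        sxx += np.abs(fx) ** 2      # auto-spectrum of x
        syy += np.abs(fy) ** 2      # auto-spectrum of y
        sxy += fx * np.conj(fy)     # cross-spectrum
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, np.abs(sxy) ** 2 / (sxx * syy)


def band_coherence(x, y, fs, band, window_s=4.0):
    """Mean coherence within a frequency band given as (low_hz, high_hz)."""
    freqs, coh = coherence_spectrum(x, y, fs, window_s)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return float(coh[mask].mean())
```

With a speech envelope `env` and a sensor time course `meg` sampled at the same rate `fs`, `band_coherence(env, meg, fs, (6.5, 8.0))` gives a single theta-band value between 0 and 1 that can be compared across conditions such as normal and jittered speech.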

Analysis of the dyslexic data is underway. We compared sentence processing strategies in groups of dyslexic and control participants matched for age and non-verbal IQ. Behaviourally, dyslexic participants performed worse on measures of phonological processing, as expected. The neuroimaging analysis further showed that cortical entrainment in the theta range, related to syllabic parsing, was weaker in dyslexic participants than in controls. Furthermore, dyslexic participants did not show a reduction of delta-band entrainment for contextually and temporally predictable speech. This suggests that, unlike in controls, predictability did not reduce the effort of higher-order lexico-syntactic parsing. At the same time, however, dyslexic participants showed similar effects of predictive context processing, as indexed by the well-explored N400 effect: a reduction of signal amplitude for more predictable words. Overall, this preliminary analysis suggests that while lower-level linguistic processes in poor and dyslexic readers are affected, higher-level contextual analysis remains intact. The ongoing work on the dataset seeks to explore these effects in greater detail.
The key findings presented so far for controls, together with the emerging results for dyslexic participants, have expanded our understanding of the predictive processing of speech within the cortical networks. By combining oscillatory and multivariate techniques for data analysis with neural network methods for stimulus selection, we showed that language processing is naturally flexible. This should have an impact on our understanding of language recovery and on the design of remediation strategies that make use of this inherent flexibility.

The PreSpeech project has also served as a stepping stone for new collaborative cross-disciplinary research. Together with colleagues from the Computer Science faculty at the University of the Basque Country, we are using data collected in this project to design neural network tools for dyslexia diagnosis from MEG data. We expect this side of the project to have a significant impact on future diagnostic tools informed by neuroimaging and, in the longer term, on more personalised medical assessment.
prespeech-image.png