
Natural speech comprehension: Comprehension of speech in noise

Final Report Summary - SPIN (Natural speech comprehension: Comprehension of speech in noise)

The mastery of speech, the ability to translate our mental images into an acoustic message that we can share with others, is central to what makes us human, and understanding the neurocognitive mechanisms underpinning speech comprehension is among the greatest open questions in science. Before the SpiN project (natural speech comprehension: comprehension of speech in noise) began, speech comprehension was mostly studied in quiet environments. Although it is easy to understand speech delivered through headphones in a soundproof laboratory booth, in natural listening conditions speech sounds always occur with some amount of background noise, making the listening task much more complex. Moreover, background noise is the principal problem reported by people with hearing loss or auditory processing disorders. It therefore appeared necessary to develop research projects exploring speech-in-noise comprehension, considered the “natural” manner of perceiving speech.

The second main concern at the origin of the SpiN project was to propose an interdisciplinary approach to speech-in-noise comprehension, circumventing the limits traditionally encountered in single-discipline work conducted mostly either in acoustics and psychoacoustics or in experimental psycholinguistics. The SpiN project first made it possible to establish an interdisciplinary research team of researchers, postdoctoral researchers and doctoral students with backgrounds in the humanities (linguistics, experimental psycholinguistics), the life sciences (cognitive neuroscience), psychoacoustics and acoustic-signal engineering, all focusing on the study of speech in noise.
The SpiN team conducted experimental studies on speech-in-noise comprehension that initially aimed to define the boundaries of speech intelligibility in noise. On the speech side, we arrived at a scale of phoneme resistance to noise in which high-energy sibilants are the most resistant speech sounds and low-energy fricatives the least resistant, highlighting the particular role of strong coherent energy structures, such as formants and sibilants, in noisy situations. We also produced phoneme confusion matrices for French (sketched below). Moreover, we explored the influence of the nature of the background noise and of the listening situation. Using different languages in the background, for example, we showed that known languages are the most harmful background noises and that different unknown languages produce different masking effects depending on their linguistic and acoustic properties. Our studies revealed that the linguistic content of a masker, such as its lexical and phonetic information, is the most disturbing component, but that listeners benefit from binaural release of masking whenever spatial location information can be used to reduce masking effects.
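As an illustration of the confusion-matrix analysis, the following minimal Python sketch tallies a matrix from (presented, reported) phoneme pairs; the phoneme subset and trial data are invented stand-ins, not the project's actual French materials.

    import numpy as np

    # Illustrative phoneme subset (not the full French inventory used in the project)
    phonemes = ["s", "z", "f", "v", "a", "i"]
    index = {p: k for k, p in enumerate(phonemes)}

    # Hypothetical (presented, reported) pairs from an identification task in noise
    trials = [("s", "s"), ("f", "s"), ("f", "f"), ("v", "f"), ("z", "s"), ("s", "s")]

    # Rows index the presented phoneme, columns the listener's report
    counts = np.zeros((len(phonemes), len(phonemes)))
    for presented, reported in trials:
        counts[index[presented], index[reported]] += 1

    # Normalise each row into identification probabilities
    row_sums = counts.sum(axis=1, keepdims=True)
    probabilities = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    print(probabilities.round(2))

Each row then gives the identification probabilities for one presented phoneme, with the off-diagonal mass showing which sounds are confused with which in noise.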
We also demonstrated that as extracting the target signal becomes more difficult, interference arising at the perceptual stages reduces the resources available for deeper semantic processing. Semantic activation of the background babble is observed only when its intelligibility is sufficient (few talkers in the babble), highlighting the non-automaticity of semantic activation and suggesting shared neural resources between listening and understanding. This result is crucial because it shows how deleterious noise is and how it can affect higher levels of cognition in both the short and the long term. Turning to a population with language impairment, people with dyslexia diagnosed solely on the basis of a reading deficit also turn out to experience difficulties in processing speech. This difficulty has often been overlooked because it is hardly observable in favourable listening conditions, i.e. speech-in-quiet; however, it becomes quite reproducible as soon as listening conditions become challenging, i.e. speech-in-noise and particularly speech-in-speech, potentially relating to phonological accounts of dyslexia. We nevertheless demonstrated that adults with dyslexia were able to rely on de-noising strategies based on spatial unmasking. This compensation effect is linked to hyper-activation of specific brain areas (right STS/STG), and a voxel-based morphometry (VBM) analysis revealed grey matter distribution abnormalities in this region in adults with dyslexia.
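The spatial-unmasking benefit mentioned above is commonly quantified as spatial release from masking (SRM): the improvement in speech reception threshold (the SNR giving 50% intelligibility) when maskers are spatially separated from the target. The sketch below estimates it by fitting logistic psychometric functions to invented data points; the report does not specify the project's exact fitting procedure, so this shows only one standard approach.

    import numpy as np
    from scipy.optimize import curve_fit

    def logistic(snr, srt, slope):
        # Psychometric function: proportion correct versus SNR (dB);
        # srt is the SNR at which intelligibility reaches 50%
        return 1.0 / (1.0 + np.exp(-slope * (snr - srt)))

    snrs = np.array([-12.0, -9.0, -6.0, -3.0, 0.0, 3.0])        # tested SNRs (illustrative)
    colocated = np.array([0.05, 0.15, 0.40, 0.70, 0.90, 0.97])  # maskers at the target's location
    separated = np.array([0.20, 0.45, 0.75, 0.92, 0.98, 0.99])  # maskers moved to the side

    popt_co, _ = curve_fit(logistic, snrs, colocated, p0=[-4.0, 0.5])
    popt_sep, _ = curve_fit(logistic, snrs, separated, p0=[-8.0, 0.5])

    # Spatial release from masking: threshold improvement from spatial separation
    print(f"SRM = {popt_co[0] - popt_sep[0]:.1f} dB")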
To conduct these studies and further understand the links between language capacities and speech-in-noise processing, we developed several new methodologies. For example, we established an adult screening test for central auditory capacities that disentangles speech processing from general auditory capacities as much as possible. This tool must still be validated on a large population, and normative data remain to be gathered. Moreover, to explore more precisely which auditory primitives are used during speech perception, the SpiN project made significant progress towards the development of Auditory Classification Images, adapting an experimental method that had until now worked only in vision and that promises to be quite fruitful and to fill a paradigmatic gap in speech research.
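Classification images rest on reverse correlation: the noise added on each trial is related to the listener's trial-by-trial responses, so the time-frequency regions that systematically sway the decision emerge as a weighting map. The toy Python simulation below uses the simplest mean-difference estimator with a simulated observer standing in for a real listener; published auditory analyses typically rely on regularised regression rather than this raw average.

    import numpy as np

    rng = np.random.default_rng(0)
    n_trials, n_bins = 5000, 64  # trials and time-frequency bins (toy sizes)

    # Simulated observer whose category decision is driven by noise energy
    # in bins 20-23 (a stand-in for a real phoneme-categorisation listener)
    template = np.zeros(n_bins)
    template[20:24] = 1.0

    noises = rng.normal(size=(n_trials, n_bins))        # noise field on each trial
    decision_var = noises @ template + rng.normal(size=n_trials)
    responses = decision_var > 0                        # binary category response

    # Mean-difference classification image: bins that consistently push the
    # decision one way or the other come out with non-zero weight
    aci = noises[responses].mean(axis=0) - noises[~responses].mean(axis=0)
    print(np.argsort(np.abs(aci))[-4:])  # should recover the template bins (20-23)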