Edinburgh Speech Science and Technology

Final Activity Report Summary - EDSST (Edinburgh Speech Science and Technology)

Scientists study speech through speech science and speech technology. In speech science, one objective is to find out how speech sounds are produced in different contexts by different people (phonetics). Another is to discover why people mispronounce words or have trouble speaking, and to develop therapies to help them (speech therapy). Speech technology is concerned with developing computer programs to analyse, generate and recognise speech. Although these disciplines are all concerned with the same subject, speech, people working in them often do not interact with one another, even though each strand of work stands to benefit substantially from the others.

In the Marie Curie Early Stage Training programme EDSST (Edinburgh Speech Science and Technology), we set out to create a multidisciplinary environment in which early-stage researchers (fellows) could train with experts from both speech technology and speech science. Speech technology expertise included speech synthesis, speech recognition, machine learning and multimodal interaction. Speech science expertise included articulatory and instrumental phonetics, clinical phonetics, and speech therapy. EDSST included five long-term fellows, who studied for PhDs during their time in Edinburgh, and five short-term fellows, who were pursuing PhD studies at other institutions. Each fellow worked with supervisors from both speech science and speech technology.

Advances in speech synthesis included:
(1) the use of models of speech and voice production to make speech synthesis more flexible, making it easier for computers to project different, appropriate personalities and emotions;
(2) ways of analysing articulation data to drive speech synthesisers that reproduce speech production processes; and
(3) ways of making computer speech sound more natural and conversational.

Advances in articulatory modelling included new ways of analysing articulatory data collected with technologies such as ultrasound and electropalatography, which have important implications for the study of speech production and for speech therapy. They also included studies of variation in speech production and of the nature of articulation problems in people with childhood apraxia of speech, a poorly understood speech production disorder.

Advances in recognition and dialogue included new algorithms for searching for specific expressions in large multimedia databases, and new approaches to spoken interaction with computers that are less cognitively demanding than current voice interfaces.
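
To illustrate the kind of search task involved, the short Python sketch below locates a query phrase in a time-stamped speech recognition transcript. The transcript format, the function name find_phrase and the use of a simple 1-best transcript are illustrative assumptions only; the actual algorithms developed in EDSST are not described here and would typically work with richer representations of the recognised speech.

    # Toy sketch of searching for an expression in recognised speech,
    # assuming time-stamped 1-best transcripts (an illustrative simplification).
    from typing import List, Tuple

    Token = Tuple[str, float, float]  # (word, start_seconds, end_seconds)

    def find_phrase(transcript: List[Token],
                    phrase: List[str]) -> List[Tuple[float, float]]:
        """Return the (start, end) times of each occurrence of the phrase."""
        words = [w.lower() for w, _, _ in transcript]
        target = [w.lower() for w in phrase]
        hits = []
        for i in range(len(words) - len(target) + 1):
            if words[i:i + len(target)] == target:
                hits.append((transcript[i][1], transcript[i + len(target) - 1][2]))
        return hits

    # Example: find "speech synthesis" in a short recognised utterance.
    demo = [("we", 0.0, 0.2), ("study", 0.2, 0.6),
            ("speech", 0.6, 1.0), ("synthesis", 1.0, 1.7)]
    print(find_phrase(demo, ["speech", "synthesis"]))  # [(0.6, 1.7)]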

Our 10 fellows published a total of 34 scientific papers, 30 of which were peer-reviewed. All papers were published in reputable outlets that are frequently cited in the relevant literature. Long-term fellows attended an average of two conferences or workshops per year and gave an average of two talks at internal, national, and international meetings. Four of the five long-term fellows went on secondments to companies and well-known international research institutes. Three fellows conducted research that will be fed back into Festival, the major cross-platform open-source speech synthesis system. As a result, our fellows have an excellent international network of contacts. Much of the training was delivered through hands-on mentoring, and fellows also profited from the wide range of relevant courses and complementary training offered at QMU and UEDIN.
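
For readers unfamiliar with Festival, the minimal Python sketch below shows how text can be synthesised with it from a script. It assumes only that Festival and its standard text2wave helper are installed and on the PATH; the example text and output file name are illustrative.

    import subprocess

    # Minimal usage sketch: synthesise a sentence to a WAV file with Festival's
    # bundled text2wave utility (assumes Festival is installed and on PATH).
    text = "Speech synthesis makes computers talk."
    subprocess.run(
        ["text2wave", "-o", "demo.wav"],   # illustrative output file name
        input=text.encode("utf-8"),
        check=True,
    )
    # demo.wav now contains the synthesised utterance in Festival's default voice.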

Finally, the EDSST fellows have interacted closely with other universities and research institutions, which has resulted in several long-term collaborations, as well as synergies with other collaborative projects such as the Marie Curie ITN SCALE, the FP7 STREP EMIME, the UK EPSRC project MultiMemoHome, and the Scottish Funding Council project MATCH.