Novel voice adaptation methods to facilitate multilingual communication in Europe

With rapid globalisation and the need for communication across multiple languages, attention has turned to the development of supporting tools and applications. An EU initiative contributed to advances in this area that will ultimately help people communicate more effectively.

Digital Economy

The EU-funded CLSASTS (Rapid cross-lingual speaker adaptation for statistical text-to-speech systems) project set out to refine personalised speech-to-speech applications. More specifically, it aimed to extend text-to-speech synthesis through new methods for statistical text-to-speech (STS) systems.

Project work covered the development of state-of-the-art English and Turkish STS systems and extensive testing of their quality and intelligibility. For the Turkish system, 10 hours of studio voice recordings were gathered from 3 professional voice artists, and pronunciation generation, text processing and syntactic analysis algorithms were created for the Turkish language. Test results showed the quality and intelligibility of the Turkish STS system to be equal to those of its English equivalent.

A novel hybrid statistical/unit selection speech synthesis system was developed that takes advantage of the morphological structure of the Turkish language. This system delivered better speech quality than the baseline STS system, with only a minimal increase in memory requirements.

Turkish data collected from broadcast news and university students enabled the creation of a database of 70 male and 70 female Turkish speakers. In addition, the CLSASTS team developed eigenvoice-based speaker adaptation algorithms and a novel Bayesian eigenvoice technique. The latter, in combination with a nearest-neighbour approach, achieved considerably higher speaker similarity. The nearest-neighbour algorithm performed as well as the single-nearest-neighbour method. Non-linear dimensionality reduction methods, however, did not improve performance over the baseline system.

Given the large number of languages spoken in Europe, CLSASTS will have important socioeconomic implications by improving communication between EU countries. By contributing to ongoing speech-to-speech translation efforts, it will give Europe a competitive edge. In addition, the technology will encourage new companies and/or commercial production.
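
For readers unfamiliar with eigenvoice-based speaker adaptation, the general idea can be illustrated with a minimal sketch. It is not the project's implementation and does not reproduce its Bayesian eigenvoice technique: it assumes a speaker is summarised by a vector of model parameters, that a mean voice and a small orthonormal set of eigenvoices are already available, and that adaptation is a plain least-squares projection followed by a nearest-neighbour lookup among training speakers. All function names, speaker names and numbers below are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(speaker, mean_voice, eigenvoices):
    """Eigenvoice weights for a speaker vector.

    Assumes the eigenvoices are orthonormal, so the least-squares
    weights are simply inner products with the centred vector.
    """
    centred = [s - m for s, m in zip(speaker, mean_voice)]
    return [dot(centred, e) for e in eigenvoices]


def reconstruct(weights, mean_voice, eigenvoices):
    """Adapted voice: mean voice plus a weighted sum of eigenvoices."""
    adapted = list(mean_voice)
    for w, e in zip(weights, eigenvoices):
        adapted = [a + w * ei for a, ei in zip(adapted, e)]
    return adapted


def nearest_speaker(weights, training_weights):
    """Name of the training speaker closest in eigenvoice-weight space."""
    def sq_dist(name):
        return sum((a - b) ** 2 for a, b in zip(weights, training_weights[name]))
    return min(training_weights, key=sq_dist)


# Toy data (hypothetical): 3 model parameters, 2 eigenvoices.
mean_voice = [1.0, 1.0, 1.0]
eigenvoices = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
new_speaker = [3.0, 0.0, 1.5]
training_weights = {"spk_a": [2.5, -1.0], "spk_b": [-2.0, 1.0]}

weights = project(new_speaker, mean_voice, eigenvoices)
adapted = reconstruct(weights, mean_voice, eigenvoices)
closest = nearest_speaker(weights, training_weights)
```

In this toy setup the new speaker is described by just two weights instead of three parameters, which is why such methods can adapt from little data; a real system would work with far larger parameter vectors and estimate the eigenvoices from a speaker database.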

Project Information

CLSASTS

Grant agreement ID: 268409

Project closed

Start date 1 February 2011

End date 31 January 2015

Funded under

Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)

Total cost

€ 100 000,00

EU contribution

€ 100 000,00

Coordinated by

OZYEGIN UNIVERSITESI
Türkiye
