SPEECHDAT aims to provide basic speech resources to spur research and technological development in automated services, including mobile and fixed telephone communication. Such resources are especially needed to develop, train and test robust speech recognisers and speaker verification systems. The SPEECHDAT databases will cover all 11 official EU languages plus Norwegian, Slovenian and Welsh, addressing numerous application parameters, speaking styles and environmental influences. They will be geared to recognition tasks over fixed and mobile telephone networks, as well as to the training and testing of telephone-based verification systems.
During the first year of the project, all specifications needed for speech data collection over fixed and mobile telephone networks, including speaker verification, were completed. Most of the hardware platforms (telephone servers) were installed, and some were successfully validated using a collection from 10 sample speakers.
The project is now well prepared to start the actual collection of speech data. The databases will cover all eleven official languages of the European Union as well as Norwegian, Slovenian, Welsh, and specific variants of Dutch, French, German and Swedish. They will also span a wide range of applications (application-oriented words, phonetically rich sentences, spontaneous utterances), speaking styles (commands, carefully pronounced speech and spontaneous speech) and environmental influences (mobile and fixed telephone networks).
Three major types of database are to be created. For each official EU language, a 5,000-speaker database will be recorded over the fixed telephone network for training and testing speech recognisers. For selected languages, a 1,000-speaker database will be recorded over mobile telephone networks. For some languages, a speaker verification database will be built, with multiple calls from a small number of speakers, for training and testing verification systems over the telephone network.
These resources will be suitable for developing and training robust speech recognisers and for developing and testing robust speaker verification systems.
Establishing uniform, high-quality, tested platforms at each site represented a major effort. The detailed specification of the speech databases has been finished and agreed by all members of the consortium. This will ensure that the speech databases are useful for a wide range of mono- and multilingual applications, and for other potential end-users. The consortium has established close co-operation with ELRA concerning the future promotion and distribution of the SpeechDat databases.
The Way Ahead
The consortium includes many of the major European players in the development of voice-driven teleservices. The partners will themselves use the resulting databases to build their own teleservices.
The major steps in the next year concerning the creation of the speech databases will be:
- installation and validation of the remaining hardware platforms
- recording and annotation of the speech databases
- validation of the speech databases.
The SpeechDat database products and demonstrators of teleservices based on SpeechDat will be reported and exhibited at international conferences and trade shows, such as EuroSpeech, ICASSP, Voice and Cebit.
Funding Scheme: CSC - Cost-sharing contracts
CV3 1HJ Coventry
78392 Bois D'arcy
2264 XZ Leidschendam
CB2 5LD Cambridge