CORDIS - EU research results
Content archived on 2024-05-24

Speech Driven Interfaces for Consumer Applications

Objective

The project promotes the integration of speech recognition technology into consumer devices for many languages and acoustic environments. Examples of such devices are mobile telephones, TV control sets and car navigation kits. Speech-driven interfaces are transferred from one language to others, and to different acoustic environments, by means of language- and application-specific speech databases, which are used to train the acoustic models of the recognisers. Speech databases covering 18 languages and the typical acoustic environments of the application areas are developed. Because acoustic environments are so diverse, the speech databases alone cannot cover all conditions sufficiently. To avoid degradation in recognition performance, methods are therefore investigated that allow the speech databases to be adapted to the specific acoustic environment in which the device is used.

Objectives:
The overall goal of SPEECON is to enable the project partners to transfer their speech recognition technology to interfaces for consumer devices such as mobile telephones, TV controls and home appliances, for many languages and acoustic environments. To transfer the recognisers, a set of speech databases will be created covering 18 languages and the different acoustic environments found at home, in the office and in public places. To broaden the range of applications, specific speech database adaptation techniques are developed. The feasibility of the transfer approach is shown by three demonstrators.

Work description:
The market for voice-driven consumer devices will be analysed to derive the functional requirements for the recognisers needed in market-relevant applications. These requirements determine the specification of the speech databases to be created. The specification in turn defines the recording platforms, the acoustic environments for recording, the corpus, the annotation conventions and the speakers to be recorded.

Eighteen speech databases will be created in the following steps:
- building and testing the recording platform
- recruitment and recording of speakers
- annotation of the recorded items.
The databases will be assessed by an external validation centre to ensure that they reach the necessary quality.

In parallel with the production of the databases, algorithmic methods are investigated that allow a given database to be transferred from one acoustic environment to another. Specific acoustic environments, selected on the basis of the market analysis mentioned above, are investigated. This research should provide early input to the project, so that no database is recorded which could be produced with less effort by algorithmic methods.
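The project description does not say which algorithmic methods were investigated. As a purely illustrative sketch, one widely used way to approximate a different acoustic environment is to mix recorded background noise into clean speech at a chosen signal-to-noise ratio (SNR). The function names `mean_power` and `mix_at_snr` below are hypothetical, and signals are assumed to be plain lists of float samples.

```python
import math


def mean_power(signal):
    """Return the mean power (average squared sample value) of a signal."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean signal so that the mixture has the requested SNR.

    Illustrative assumption: both inputs are lists of float samples at the
    same sampling rate, and the noise is at least as long as the speech.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]

    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise signal has zero power")

    # SNR(dB) = 10 * log10(p_clean / (gain^2 * p_noise)); solve for gain.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]
```

In this sketch a database recorded in a quiet office could be reused to approximate, for example, an in-car condition by calling `mix_at_snr(utterance, car_noise, 10.0)` for each utterance. Additive noise is only one aspect of an acoustic environment; reverberation and microphone characteristics would need separate treatment.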

The feasibility of the transfer will be demonstrated by three consumer applications, which work for different languages and application-specific acoustic environments.


Milestones:
- Market analysis and specification of databases
- Exploration of potential of algorithmic transfer to specific environments
- Validated speech databases for 18 languages
- Tools for environmental adaptation developed and evaluated
- 3 consumer applications (demonstrators)

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques.

Call for proposal

Data not available

Coordinator

SIEMENS AKTIENGESELLSCHAFT
EU contribution
No data
Address
WITTELSBACHERPLATZ 2
80333 MUENCHEN
Germany

Total cost
No data

Participants (9)