Robust Analytical Speech Recognition System


The goal of the ROARS project is to increase the robustness of an existing analytical speech recognition system (i.e. one using statistical knowledge about syllables, phonemes and phonetic features), and to use it as part of a speech understanding system with connected-word and dialogue capability for two languages: French and Spanish.
Two identical hardware prototypes were built for speech analysis; the statistical knowledge necessary for the feature-based approach was established for Spanish and improved for French; and, finally, the corresponding speech recognition systems were implemented. Acceptable recognition rates in a multispeaker environment were thereby demonstrated for the French recognition system with a 100-word vocabulary (work is still in progress for the Spanish system).

In parallel, a slow, progressive adaptation for vowel recognition was investigated, and the Lombard effect was studied in both French and Spanish.

To improve the robustness of voice input through dialogue and understanding, a set of tools (editors, compilers, etc.) was developed for 'finalized dialogue', that is to say task-oriented and cooperative dialogue. The implementation of the two demonstrations is in progress.
The work started from an existing system implemented for the French language. This system has been shown to operate in real time, to be speaker-independent, and to give satisfactory results with continuously uttered connected words.

The aim of the first phase of the project was to develop and implement the corresponding knowledge bases for the Spanish language and to enhance, for both languages, the robustness of this system against:

- Intra- and inter-speaker changes in articulation, by improving the statistical knowledge used in the system.
- Various ambient noises, by analysing the degradation induced in each acoustic cue used by the phonemic recognition system, by analysing the changes in articulation (at the feature level) when the speaker is exposed to different noise conditions, and by studying and testing improvements aimed at minimising these degradations.

All these tasks are run in parallel for both languages, French and Spanish. To study, implement and test improvements, two identical hardware prototypes were built (one for the French application and one for the Spanish).

The aim of the second phase is to implement two demonstrations of speech understanding for air traffic control (one in French, one in Spanish) and then to integrate voice input with other devices (such as keyboards, tracker-balls and screens).


