Content archived on 2022-12-23

SPOKEN QUERIES IN EUROPEAN LANGUAGES

Exploitable results

The system for spoken queries in Slovenian over the telephone consists of four basic modules: the word recognition module, the linguistic analysis module, the dialogue manager and the speech synthesis module. The application domain of the experimental system is flight information: the system uses the Adria Airways flight timetable to provide users with timetable information over the telephone. The word recognition module uses acoustic models of Slovenian polyphones together with bigram and trigram stochastic language models. A DCG-based parser performs the linguistic analysis of the recognized word sequences. The parser extracts the words that matter most for a database query and transforms them into a form that is read by the dialogue manager. The task of the dialogue manager is to interpret the meaning of the input utterance and to produce an appropriate answer or to initiate a clarifying question. The answers generated by the dialogue manager are synthesised by the speech synthesis module, which involves grapheme-to-phoneme transcription and prosodic modelling. The basic speech units, diphones, are concatenated using the TD-PSOLA technique. The telephone-based system is designed so that it can be easily adapted to a number of different speech dialogues and application databases.
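The hand-off from linguistic analysis to the dialogue manager described above can be illustrated with a minimal Python sketch. It is not the project's implementation: simple keyword spotting stands in for the DCG-based parser, the timetable rows are invented, English words replace Slovenian for readability, and the names `TIMETABLE`, `extract_slots` and `respond` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical timetable rows: (origin, destination, departure time).
# These values are invented for illustration only.
TIMETABLE = [
    ("ljubljana", "frankfurt", "07:10"),
    ("ljubljana", "zurich", "13:45"),
    ("ljubljana", "frankfurt", "17:30"),
]

# Every city that appears in the timetable counts as a keyword.
CITIES = {row[0] for row in TIMETABLE} | {row[1] for row in TIMETABLE}


def extract_slots(words: List[str]) -> Dict[str, Optional[str]]:
    """Keyword spotting (an assumed stand-in for the DCG parser).

    A city preceded by 'from' fills the origin slot,
    a city preceded by 'to' fills the destination slot.
    """
    slots: Dict[str, Optional[str]] = {"origin": None, "destination": None}
    for prev, word in zip(words, words[1:]):
        if word in CITIES:
            if prev == "from":
                slots["origin"] = word
            elif prev == "to":
                slots["destination"] = word
    return slots


def respond(slots: Dict[str, Optional[str]]) -> str:
    """Answer from the timetable, or ask a clarifying question if a slot is empty."""
    if slots["destination"] is None:
        return "Where would you like to fly to?"
    if slots["origin"] is None:
        return "Where are you departing from?"
    origin, destination = slots["origin"], slots["destination"]
    times = [t for o, d, t in TIMETABLE if o == origin and d == destination]
    if not times:
        return f"There are no flights from {origin.title()} to {destination.title()}."
    return (
        f"Flights from {origin.title()} to {destination.title()} "
        f"depart at {', '.join(times)}."
    )


if __name__ == "__main__":
    recognized = "when can i fly from ljubljana to frankfurt".split()
    print(respond(extract_slots(recognized)))
```

In this sketch the slot dictionary plays the role of the form read by the dialogue manager; an incomplete form triggers a clarifying question instead of a database lookup.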
A research project was set up to develop a multilingual, multifunctional information retrieval system. The system is implemented in a two-stage approach: first, a system is trained with read speech using a special training program. The stochastic language models for spontaneous speech in Slavic languages (Czech, Slovak and Slovenian) were created with the help of a large corpus of 10 000 training sentences for each language. The national recognizers produce N-best word sequences, rescored using a polygram language model, as input to the second processing step (linguistic analysis). The Czech recognizer was evaluated on microphone-quality speech and achieved a word accuracy of 74% to 86% for speaker-independent recognition. The linguistic analysis of the user utterances is realized with a language-independent approach (keyword classification trees) and language-dependent substring parsers. The dialogue manager, or dialogue module, interprets the meaning of the input utterance and produces an appropriate answer or a clarifying question. The input to the dialogue module is the semantic interpretation of the utterance represented in SIL, produced by the linguistic module. During dialogue interpretation, the semantic interpretations are matched against the dialogue model to decide the subsequent steps of cooperative user interaction in the structured dialogue model. Because of this partitioned interactional model, dialogue management is partly independent of the language and the information service domain. Significant outcomes of the project are domain-dependent speech databases in digital form for each of the languages (e.g. the DOVLAS database for Czech). Since the transliteration and standard pronunciation of each utterance, together with an automatically derived time alignment, are also available, the data can serve as the basis for bootstrapping a recognition module for any other application in these three languages.
These data are made available to the European Speech Community on CD-ROM (compact disc read-only memory).
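The rescoring of N-best word sequences with a language model can be sketched as follows. This is an illustration under stated assumptions, not the project's recognizer: an add-one smoothed bigram model stands in for the polygram model, the training sentences, acoustic scores and the `lm_weight` parameter are invented, and English replaces Czech for readability.

```python
import math
from collections import Counter

# Tiny invented training corpus (the project used 10 000 sentences per language).
CORPUS = [
    "i want to fly to prague",
    "i want a flight to brno",
    "when does the flight to prague leave",
]


def train_bigram(sentences):
    """Count bigrams and their histories; return counts plus vocabulary size."""
    histories, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        histories.update(tokens[:-1])            # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return histories, bigrams, len(vocab)


def lm_logprob(sentence, model):
    """Add-one smoothed bigram log-probability of a word sequence."""
    histories, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (histories[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def rescore(nbest, model, lm_weight=1.0):
    """Reorder (hypothesis, acoustic_log_score) pairs by the combined score."""
    return sorted(
        nbest,
        key=lambda hyp: hyp[1] + lm_weight * lm_logprob(hyp[0], model),
        reverse=True,
    )


if __name__ == "__main__":
    model = train_bigram(CORPUS)
    # Hypothetical recognizer output: the acoustically best hypothesis contains an error.
    nbest = [
        ("i want to fly two prague", -10.0),
        ("i want to fly to prague", -10.5),
    ]
    for text, acoustic in rescore(nbest, model):
        print(f"{acoustic:6.1f}  {text}")
```

The language model penalizes word pairs it has never seen, so a hypothesis that scores slightly worse acoustically can still move to the top of the list before it is passed on to linguistic analysis.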
