Speech Recognizer Quality Assessment for Linguistic Engineering

Project Information

SQALE

Grant agreement ID: LRE62058

Closed project

Start date 1 December 1993

End date 1 June 1995

Funded under

Specific programme of research and technological development (EEC) in the field of telematic systems in areas of general interest - Linguistic research and engineering -, 1990-1994

Total cost

No data

EU contribution

No data

Coordinated by

TNO Institute for Human Factors
Netherlands

Objective

The project aims to develop an assessment paradigm for large-vocabulary, speaker-independent, continuous speech recognition in Europe, taking into account the distinctive characteristics of a multilingual environment and identifying the problems it raises. The project will also begin defining guidelines for future assessment actions. If the SQALE project proves successful, these guidelines could be extended into an evaluation paradigm for future large-scale European language/speech programmes.

The project will also directly contribute to the assessment and evaluation of NLP systems in at least three ways:

- a general framework will be established for comparing machine generated output with reference corpora;
- a first step will be taken toward handling real-world phenomena, such as false starts and hesitations;
- the effects of differing test set perplexities across various European languages will be quantified.
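
As an illustration of what comparing machine-generated output with a reference corpus can look like, the sketch below computes a word error rate by aligning each recognizer hypothesis with its reference transcription using word-level edit distance. The source does not describe the scoring procedure used in SQALE, so this is an assumption: word error rate is simply a widely used measure for this kind of comparison, and the function names `word_errors` and `word_error_rate` are hypothetical.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Minimum number of word substitutions, insertions and deletions
    needed to turn the hypothesis into the reference (Levenshtein distance
    over whitespace-separated words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[len(hyp)]


def word_error_rate(pairs) -> float:
    """Total word errors divided by total reference words over a test set
    given as (reference, hypothesis) pairs."""
    pairs = list(pairs)
    total_words = sum(len(ref.split()) for ref, _ in pairs)
    if total_words == 0:
        raise ValueError("the reference corpus contains no words")
    total_errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    return total_errors / total_words
```

Pooling errors over the whole test set, rather than averaging per-sentence rates, keeps long and short sentences weighted by their word counts.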

The SQALE experiment will therefore not only extend European standards in speech recognition assessment (which are limited to isolated and connected word systems, without a direct link with language models) but also initiate the necessary and much awaited integration between speech and NL assessment methodologies.

Approach and Methods

The project takes into account the experience gained by the partners in the 1992 DARPA RM and WSJ evaluations in order to investigate how the US protocols can be improved and extended into a multilingual experimental design, as required for a European approach.

The basic idea of the project is to form a small consortium made up of a coordinating laboratory with high technical expertise in the field and three other laboratories testing their in-house recognition systems. The testing laboratories are located in three different countries, where three different languages are spoken. Hence two dimensions of the research paradigm are investigated: the recognition algorithms (at least 3) and the languages (at least 3). In particular, the experiment will focus on two independent research questions:

- the merits of different recognition algorithms applied to the same data, and
- the relative difficulties in speech recognition across different languages.

Having multiple sites applying their algorithms on the same database makes it possible to discuss the merits of different methods on the same data. Testing the same algorithm on different databases in different languages will reveal the relative difficulties of speech recognition for different languages, and the degree of robustness of the algorithm with respect to a given language.
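
The two-dimensional design described above can be pictured as a table of error rates indexed by recognition system and language. The sketch below is a minimal illustration under assumptions: the function name `marginal_means` is hypothetical, and the source does not say how results were actually summarized. Averaging over languages gives a per-system figure for the first research question, and averaging over systems gives a per-language figure for the second.

```python
from collections import defaultdict


def marginal_means(results):
    """Summarize a {(system, language): error_rate} table.

    Returns two dicts: the mean error rate per system (over the languages
    it was tested on) and the mean error rate per language (over the
    systems tested on it).
    """
    by_system = defaultdict(list)
    by_language = defaultdict(list)
    for (system, language), error_rate in results.items():
        by_system[system].append(error_rate)
        by_language[language].append(error_rate)

    def mean(values):
        return sum(values) / len(values)

    system_means = {name: mean(vals) for name, vals in by_system.items()}
    language_means = {name: mean(vals) for name, vals in by_language.items()}
    return system_means, language_means
```

A simple mean is only comparable across systems when every system is tested on the same set of languages; with an incomplete table the comparison should be restricted to the cells the systems share.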

Each testing laboratory will be responsible for providing data in its own language (both written and spoken corpora), for assessing its systems according to a commonly accepted protocol, and for performing the assessment procedure for English and at least one other language. The coordinator will organize the assessment experiment and will be responsible for the timely distribution of training and test materials and for gathering the test results. The coordinator will also score the recognizers' output and analyze the results.

The high quality of the three test sites and their recognition systems (all three labs performed at the top level in the DARPA 92 benchmark test) and the high technical standard of the coordinating laboratory are considered essential ingredients for the success of the project.

Exploitation and Future Prospects

SQALE intends to bridge the gap between the state of the art in commercial systems assessment - as exemplified by the SAM Esprit project - and the state of the art in research systems assessment - as represented by ARPA. It will therefore have direct relevance to current leading-edge research and development, and should also have a pull-through effect on future application-driven and technology-driven research. Furthermore, SQALE will operate in a multilingual European context and will therefore go beyond the current ARPA scope. Cross-language assessment and evaluation have never previously been performed on this scale: SQALE will be a pioneer project in this respect.

As far as more immediate and practical results of the project are concerned, the dissemination of the following material is envisaged (through EAGLES):

- the speech corpora, including the speech signal and the associated transcription, the lexica and the text corpora;
- the results obtained by each testing participant in its own language and in the common language;
- the guidelines and recommendations on how to conduct and organize systems evaluation in a multinational, multilingual context.

These results will constitute a baseline from which it will be possible to improve the methodology, enlarge the number of participants, augment the difficulty of the tasks and ensure the coordination with closely related research areas, such as written language processing and machine translation.

In the near future, a primary basis for interaction between speech and NL systems will be the common use of text corpora and statistically based language models. The development of common assessment methodologies and protocols will be equally relevant for NL and speech integration.
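
Test set perplexity, which the objectives propose to compare across languages, is a standard way to measure how hard a test text is for a statistical language model trained on a text corpus. The sketch below uses a deliberately simple model as an assumption: a unigram model with add-one smoothing, whose vocabulary is taken to be the union of training and test words. The language models used by the SQALE participants are not described in the source, and the function name `unigram_perplexity` is hypothetical.

```python
import math
from collections import Counter


def unigram_perplexity(train_tokens, test_tokens):
    """Perplexity of test_tokens under an add-one-smoothed unigram model
    estimated from train_tokens.

    Simplifying assumption: the vocabulary is the union of training and
    test words, so every test word receives a non-zero probability.
    """
    if not test_tokens:
        raise ValueError("the test set is empty")
    counts = Counter(train_tokens)
    vocabulary = set(train_tokens) | set(test_tokens)
    denominator = len(train_tokens) + len(vocabulary)
    log2_prob = 0.0
    for token in test_tokens:
        probability = (counts[token] + 1) / denominator
        log2_prob += math.log2(probability)
    # Perplexity is 2 raised to the average negative log2 probability per word.
    return 2 ** (-log2_prob / len(test_tokens))
```

A higher value means the model is, on average, choosing among more equally likely words at each position, which is one reason recognition error rates are hard to compare directly across languages whose test sets differ in perplexity.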

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

natural sciences / computer and information sciences / databases

Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

FP3-LRE - Specific programme of research and technological development (EEC) in the field of telematic systems in areas of general interest - Linguistic research and engineering -, 1990-1994

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding Scheme

Funding scheme (or "Type of Action") inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; the specific evaluation criteria to qualify for funding; and the use of simplified forms of costs such as lump sums.

Data not available

Coordinator

TNO Institute for Human Factors

EU contribution

No data

Address

Kampweg 5, PO Box 23
3769 ZG Soesterberg
Netherlands

Total cost

No data

Participants (3)

Centre National de la Recherche Scientifique (CNRS)

France

EU contribution

No data

Address

Paris

Total cost

No data

Philips GmbH

Germany

EU contribution

No data

Address

Weisshausstraße 2
52066 Aachen

Total cost

No data

University of Cambridge

United Kingdom

EU contribution

No data

Address

Trumpington Street
CB2 1PZ Cambridge

Total cost

No data
