Linguistic Analysis of the European Languages

Project information

LING-ANALYSIS

Grant agreement ID: 291

Project closed

Start date 1 February 1985

End date 1 April 1989

Funded under

European programme (EEC) for research and development in information technologies (ESPRIT), 1984-1988

Total cost

No data

EU contribution

No data

Coordinated by

Ingegneria C. Olivetti and C. SpA
Italy

Objective

The LING-ANALYSIS project produced the software necessary to perform grapheme-to-phoneme and phoneme-to-grapheme conversion at word level. This involved conversions between the textual and acoustical representations of words and the acquisition of the knowledge required to include speech in the man-machine interface. A linguistic model, based on typical syntactic patterns extracted from texts by statistical analyses, was developed to deal with ambiguous solutions. The project covered the following languages: Dutch, English, French, German, Greek, Italian and Spanish. The first step was the development of a common methodology across the different languages in order to provide coherent and comparable results. Hardware and software tools were standardised among the partners, and language-specific tools were developed where necessary. Reference corpora of about 200,000 words, plus dictionaries and lists of ambiguities (homographs and homophones), were extracted from common European Community texts and newspapers. A linguistic model was defined and developed for the semi-automatic labelling of new text corpora and for phoneme-to-grapheme conversion on the basis of a contextual analysis.
The following results are now available for the different languages:
Conversion Algorithm: word level grapheme-to-phoneme and phoneme-to-grapheme conversion algorithms.
Analysis of Language at Word Level
-computer-readable common phonemic alphabet
-consistent systems of grammatical classes
-labelling of text corpora of a few thousand words
-dictionaries, extracted from the corpora, providing (for each word) graphemic and phonemic representations, possible grammatical tags, and usage frequency
-statistics, extracted from the dictionaries, providing: phoneme and phoneme cluster frequency; grapheme and grapheme cluster frequency; word distribution based on the grapheme length and on the length with or without frequency weighting; and the set function K(n) providing K, the percentage coverage of the corpora obtained with the n most frequent words.
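The coverage function K(n) mentioned in the last item can be illustrated with a minimal sketch. The function name `coverage` and the toy corpus are assumptions made for illustration; they are not taken from the project's software.

```python
from collections import Counter


def coverage(tokens, n):
    """Return K(n): the percentage of corpus tokens covered by the
    n most frequent word types."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # Sum the frequencies of the n most frequent word types.
    covered = sum(freq for _, freq in counts.most_common(n))
    return 100.0 * covered / len(tokens)


# Toy corpus, invented for illustration only.
corpus = "the cat saw the dog and the dog saw the cat".split()

for n in (1, 2, 3, 5):
    print(n, round(coverage(corpus, n), 1))
```

K(n) rises towards 100 as n grows, so it shows how large a dictionary has to be to reach a given share of running text.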
Disambiguation Rules for Phoneme-to-Grapheme Conversion
-list of ambiguous words and ambiguity frequency estimates regarding the grapheme/phoneme/grapheme conversions
-transition matrices providing the observed frequency of any pair or triplet of grammatical classes.
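As a rough illustration of such transition matrices, the sketch below counts pairs and triplets of grammatical classes in labelled sentences. The class labels (`DET`, `NOUN`, and so on), the function names and the sample data are hypothetical; the project's actual class systems are not described here.

```python
from collections import Counter


def transition_counts(tagged_sentences, order):
    """Count every run of `order` consecutive grammatical classes.

    order=2 gives pair counts, order=3 gives triplet counts.
    """
    counts = Counter()
    for tags in tagged_sentences:
        for i in range(len(tags) - order + 1):
            counts[tuple(tags[i:i + order])] += 1
    return counts


def relative_frequencies(counts):
    """Turn raw counts into observed relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: value / total for key, value in counts.items()}


# Hypothetical labelled sentences (class sequences only).
sentences = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "ADJ", "NOUN", "VERB"],
]

pairs = transition_counts(sentences, 2)
triplets = transition_counts(sentences, 3)
pair_freq = relative_frequencies(pairs)
print(pairs.most_common(3))
print(triplets.most_common(3))
```

A sparse dictionary keyed by class tuples is used instead of a dense matrix because most triplets never occur in a small corpus.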
Assessment of Conversions: methodologies for evaluating the statistical validity of the information appearing in the transition matrices and for comparing the expected performance in speech recognition of different class systems.
Integration in a Practical Conversion System: a blackboard model of the language that uses the available knowledge on contextual constraints to resolve the ambiguities arising from phoneme-to-grapheme conversion and to select the most likely sentence from a word lattice.
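The selection of the most likely sentence from a word lattice can be sketched as follows. This is not the project's blackboard model: it is a much simpler Viterbi search, assuming a lattice made of consecutive slots of homophone candidates and scoring paths only with class-pair probabilities. The function name `best_path`, the class labels and all probabilities are invented for illustration.

```python
import math


def best_path(lattice, pair_prob, floor=1e-6):
    """Pick the most likely word sequence from a slot lattice.

    lattice:   list of non-empty slots; each slot is a list of
               (word, grammatical_class) candidates.
    pair_prob: dict mapping (previous_class, class) to a probability.
    floor:     probability assumed for class pairs never observed.
    """
    if not lattice:
        return []
    # best[i][j] = (log score, backpointer) for candidate j of slot i.
    best = [[(0.0, None) for _ in lattice[0]]]
    for i in range(1, len(lattice)):
        row = []
        for _, cls in lattice[i]:
            scores = [
                best[i - 1][k][0]
                + math.log(pair_prob.get((prev_cls, cls), floor))
                for k, (_, prev_cls) in enumerate(lattice[i - 1])
            ]
            k_best = max(range(len(scores)), key=scores.__getitem__)
            row.append((scores[k_best], k_best))
        best.append(row)
    # Trace back from the best-scoring candidate of the last slot.
    j = max(range(len(best[-1])), key=lambda k: best[-1][k][0])
    words = []
    for i in range(len(lattice) - 1, -1, -1):
        words.append(lattice[i][j][0])
        j = best[i][j][1]
    return words[::-1]


# Hypothetical lattice of English homophones and invented probabilities.
lattice = [
    [("eye", "NOUN"), ("I", "PRON")],
    [("sea", "NOUN"), ("see", "VERB")],
    [("the", "DET")],
    [("sea", "NOUN"), ("see", "VERB")],
]
pair_prob = {
    ("PRON", "VERB"): 0.4, ("PRON", "NOUN"): 0.01,
    ("NOUN", "VERB"): 0.2, ("NOUN", "NOUN"): 0.05,
    ("VERB", "DET"): 0.3, ("NOUN", "DET"): 0.05,
    ("DET", "NOUN"): 0.6, ("DET", "VERB"): 0.01,
}
print(" ".join(best_path(lattice, pair_prob)))
```

Working in log space avoids numerical underflow on long sentences, and the floor value keeps unseen class pairs from ruling out a path entirely.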
Exploitation
Full industrial exploitation of the results is expected in the early 1990s in systems based on speech processing. Target application areas are unrestricted texts, speech synthesis, and large-vocabulary speech recognition. The acquired knowledge and the results obtained will also be useful for applications in other domains, such as optical scanning, word processing and automatic translation.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy covering all fields of science, through a semi-automatic process based on natural language processing techniques. More information: European Science Vocabulary.

natural sciences / computer and information sciences / software

Programme(s)

Multi-annual funding programmes that define the European Union's priorities for research and innovation.

FP1-ESPRIT 1 - European programme (EEC) for research and development in information technologies (ESPRIT), 1984-1988

Topic(s)

Calls for proposals are divided into topics. Each topic defines a specific subject or area that applicants' proposals should address. The description of a topic comprises its specific scope and the expected impact of the funded project.

No data available

Call for proposals

The procedure for inviting applicants to submit project proposals with the aim of receiving European Union funding.

No data available

Funding scheme

A funding scheme (or "type of action") within a programme with common features. It specifies the scope of what is funded, the reimbursement rate, the specific criteria for assessing whether costs are eligible for funding, and the use of simplified forms of cost accounting such as lump sums.

No data available

Coordinator

Ingegneria C. Olivetti and C. SpA

EU contribution

No data

Address

Corso Svizzera 185
10149 Torino
Italy

Total cost

No data

Participants (7)

Acorn Computers Ltd

United Kingdom

EU contribution

No data

Address

Acorn House Vision Park Histon
CB4 4AE Cambridge

Total cost

No data

Centre National de la Recherche Scientifique (CNRS)

France

EU contribution

No data

Address

91406 Orsay

Total cost

No data

KATHOLIEKE UNIVERSITEIT NIJMEGEN

Netherlands

EU contribution

No data

Address

ERASMUSPLEIN 1
6525 HT NIJMEGEN

Total cost

No data

RUHR-UNIVERSITY BOCHUM

Germany

EU contribution

No data

Address

Universitätsstraße 150
44780 BOCHUM

Total cost

No data

Tecnopolis Csata Novus Ortus

Italy

EU contribution

No data

Address

Strada Provinciale per Casamassima Km 3.00
70010 Valenzano Bari

Total cost

No data

UNIV NACIONAL DE EDUCACION A DISTANCIA

Spain

EU contribution

No data

Address

PABELLON GOBIERNO
X MADRID

Total cost

No data

UNIV OF PATRAS

Greece

EU contribution

No data

Address

26500 PATRAI

Total cost

No data
