Community Research and Development Information Service - CORDIS

EuroMatrix

Project ID: 034291
Funded under

EuroMatrix: Statistical and Hybrid Machine Translation Between All European Languages

From 2006-09-01 to 2009-02-28, closed project | EuroMatrix website

Project details

Total cost:

EUR 2 358 747

EU contribution:

EUR 2 066 388

Coordinated in:

Germany

Call for proposals:

FP6-2005-IST-5

Funding scheme:

STREP - Specific Targeted Research Project

Description

Teaching computers to translate

An ambitious project to develop software capable of automatically translating between the European Union’s 23 official languages promises to help governments, businesses and citizens communicate more easily and cheaply.

Called Moses, the open source software is being developed by a team of researchers in the EUROMATRIX project. They aim to provide the EU’s private and public sectors with the means to translate documents quickly and accurately between any of the 27-nation bloc’s language pairs.

To achieve that goal, the researchers are relying on innovative technologies to take machine translation beyond the current state of the art.

Billions of euros spent on translations

Currently, European governments, European institutions, companies and citizens spend billions of euros a year employing an army of translators to translate legislation, business documents and legal papers between European languages.

The workload has more than doubled in recent years as the EU has gone from 15 Member States to 27 and from 11 official languages to 23.

Machine translation, in which computers rather than people do the work, has been an option for years, although its poor quality has generally made it unviable for anything but the most rudimentary uses.

As anyone who has ever attempted to translate text on the internet knows, the output of many automated systems is often littered with punctuation errors, misplaced words and grammatical mistakes that can make the text almost unintelligible.

Developing a viable alternative to human translators

To make machine translation a viable alternative to human translation, the EUROMATRIX researchers saw the need for a new approach. Instead of simply following the path of earlier systems, which translate texts using a set of predefined linguistic rules and a constrained lexicon, the EUROMATRIX team developed software to allow the computer to learn to translate from past work and experience.

In essence, the approach relies on the computer program referring to an existing body of translated text and using statistical analysis to determine how words are used.
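The idea of learning word usage from existing translations can be illustrated with a minimal sketch. It is not code from Moses, which is a far larger phrase-based system. It uses IBM Model 1, a classic word-level statistical model, to estimate how likely each target word is to translate each source word. The function names (`train_ibm_model1`, `best_translation`) and the three-sentence German-English corpus are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word translation probabilities t(target | source) with EM.

    `pairs` is a list of (source_sentence, target_sentence) strings.
    This is a toy illustration, not the Moses training pipeline.
    """
    # Whitespace tokenisation is an assumption; real systems use proper tokenisers.
    corpus = [(src.split(), tgt.split()) for src, tgt in pairs]
    tgt_vocab = {w for _, tgt in corpus for w in tgt}

    # Start from a uniform distribution over the target vocabulary.
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (target, source) link
        total = defaultdict(float)  # expected count of each source word
        # E-step: distribute each target word over the source words in its sentence.
        for src, tgt in corpus:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: re-estimate probabilities from the expected counts.
        t = defaultdict(
            float,
            {(tw, sw): c / total[sw] for (tw, sw), c in count.items()},
        )
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy parallel corpus (German -> English).
toy_corpus = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]

table = train_ibm_model1(toy_corpus)
for word in ("das", "haus", "buch", "ein"):
    print(word, "->", best_translation(table, word))
```

Although no dictionary is supplied, the co-occurrence statistics alone are enough for the model to pair "haus" with "house" and "buch" with "book" on this corpus. Production systems apply the same principle to millions of sentence pairs and to multi-word phrases rather than single words.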

The internet makes it easy to find the vast body of existing translations this requires, allowing the system to draw on texts in different languages and greatly improve the accuracy of translations over purely rule-based approaches.

The researchers have also investigated a hybrid approach, merging both statistical and rule-based systems.

Translation workshops and commercial interest

In order to demonstrate Moses, the EUROMATRIX team has organised a series of workshops involving universities across Europe, while the open source software itself has already elicited commercial interest.

Moses is currently being used by several big European organisations as well as some small and medium businesses, according to the project partners.

Objective

EUROMATRIX aims at a major push in machine translation technology, applying the most advanced MT technologies systematically to all pairs of EU languages. Special attention will be paid to the languages of the new and near-term prospective member states. As part of this application development, EUROMATRIX will design and investigate novel combinations of statistical techniques and linguistic knowledge sources as well as hybrid MT architectures. In contrast to research funded by DARPA and ARDA with focus on non-European languages, EUROMATRIX will address urgent European economic and social needs by concentrating on European languages and on high-quality translation to be employed for the publication of technical, social, legal and political documents.

EUROMATRIX will enrich the statistical MT approach with novel learning paradigms and experiment with new combinations of methods and resources from statistical MT, rule-based MT, shallow language processing and computational lexicography/morphology.

EUROMATRIX has the following concrete objectives: translation systems for all pairs of EU languages, with a special focus on the languages of new and near-term prospective member states; efficient inclusion of linguistic knowledge into statistical machine translation; the development and testing of hybrid architectures for the integration of rule-based and statistical approaches; organization, analysis and interpretation of a competitive annual international evaluation of machine translation with a strong focus on European economic and social needs; the provision of open source machine translation technology including research tools, software and data; a systematically compiled and constantly updated detailed survey of the state of MT technology for all EU language pairs based on the developed systematic translation between all EU languages, the comparative MT evaluations and an inventory of available and needed tools, components, lingware and data.

Coordinator contact

Hans Uszkoreit (Coordinator)

Coordinator

UNIVERSITAET DES SAARLANDES
Germany

EU contribution: EUR 813 000


CAMPUS
66041 SAARBRUCKEN
Germany
Activity type: Higher or Secondary Education Establishments
Administrative contact: Hans Uszkoreit
Tel.: +49 681 3025282
Fax: +49 681 3025338

Participants

Univerzita Karlova v Praze
Czech Republic

EU contribution: EUR 265 111


Ovocny trh
116 36 Prague 1
Czech Republic
Activity type: Higher or Secondary Education Establishments
Administrative contact: Jan Hajic
Tel.: +420 221914257
Fax: +420 221914304

GROUP TECHNOLOGIES AG
Participation ended
Germany

EU contribution: EUR 102 391


HOSPITALSTR. 6
99817 EISENACH
Germany
Activity type: Other
Administrative contact: Michael Brandt
Tel.: +49 721/4901-0
Fax: +49 721/4901-99

MORPHOLOGIC SZAMITASTECHNIKAI KFT
Hungary

EU contribution: EUR 106 000


KARDHEGY UTCA
1116 BUDAPEST
Hungary
Activity type: Private for-profit entities (excluding Higher or Secondary Education Establishments)
Administrative contact: Gábor Prószéky
Tel.: +3612252323
Fax: +3612252320

CENTRE FOR THE EVALUATION OF LANGUAGE AND COMMUNICATION TECHNOLOGIES SCRL
Italy

EU contribution: EUR 94 636


VIA SOMMARIVE 18
38123 POVO - TRENTO
Italy
Activity type: Private for-profit entities (excluding Higher or Secondary Education Establishments)
Administrative contact: Danilo Giampiccolo
Tel.: +39 0461 405354
Fax: +39 0461 405372

THE UNIVERSITY OF EDINBURGH
United Kingdom

EU contribution: EUR 685 250


OLD COLLEGE, SOUTH BRIDGE
EH8 9YL EDINBURGH
United Kingdom
Activity type: Higher or Secondary Education Establishments
Administrative contact: Philipp Koehn
Tel.: +44 131 6508287
Fax: +44 131 6506626