STATISTICAL MACHINE TRANSLATION USING MONOLINGUAL CORPORA: FROM CONCEPT TO IMPLEMENTATION

Objective

METIS-II is the continuation of the successful assessment project METIS-I (IST-2001-32775). Like METIS-I, METIS-II aims at constructing free-text translations by relying on pattern-matching techniques and by retrieving the basic stock for translations from large monolingual corpora. METIS-I has exceeded the capacity of a conventional Translation Memory.

METIS-II aims at further enhancing the system's performance and adaptability by:
1) breaking the sentence barriers: the system will retrieve pieces of sentences (chunks) and will stitch them together to produce a final translation; ways of reducing the system complexity will be sought;
2) extending the resources and integrating new languages;
3) using post-editing facilities;
4) adopting semi-automatic techniques for adapting the system to different translation needs;
5) taking into account real user needs.

Resources to be employed for these purposes, other than the monolingual corpora and the basic algorithm, include: bilingual lexica, taggers, lemmatizers, chunkers, hand-crafted structure mapping rules (mapping abstract structures of the source language onto abstract structures of the target languages), DOP techniques and/or n-grams to stitch the retrieved chunks together, and post-editing facilities (e.g. target language morphological generators).
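As a rough illustration of stitching chunks together with n-grams, the sketch below scores every combination of candidate target-language chunks with a bigram model estimated from a tiny monolingual corpus and keeps the best-scoring one. It is a minimal sketch under stated assumptions, not the METIS-II implementation: the function names (`train_bigrams`, `log_prob`, `stitch`), the toy corpus, the candidate chunks, and the add-one smoothing are all hypothetical choices made for the example.

```python
import itertools
import math
from collections import Counter


def train_bigrams(corpus):
    """Count unigrams and bigrams over sentences padded with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence."""
    vocab = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(padded, padded[1:])
    )


def stitch(candidates, unigrams, bigrams):
    """Pick one candidate per chunk slot so the joined sentence scores highest."""
    best = max(
        itertools.product(*candidates),
        key=lambda combo: log_prob(" ".join(combo).split(), unigrams, bigrams),
    )
    return " ".join(best)


# Toy target-language monolingual corpus (assumption, for illustration only).
corpus = [
    "the black cat sleeps on the mat",
    "the cat sleeps on the sofa",
    "a black dog sleeps on the mat",
]

# Hypothetical candidate translations for three source chunks, e.g. as they
# might come out of a bilingual lexicon plus structure mapping rules.
candidates = [
    ["the black cat", "the cat black"],
    ["sleeps", "sleep"],
    ["on the mat", "at the mat"],
]

unigrams, bigrams = train_bigrams(corpus)
print(stitch(candidates, unigrams, bigrams))  # the black cat sleeps on the mat
```

The exhaustive search over all combinations grows exponentially with the number of chunks, which is one reason a practical system would need to reduce complexity, for instance with a beam search over partial hypotheses.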

The consortium is balanced in terms of languages and complementary skills in translation and MT issues, programming and linguistic knowledge. Furthermore, nearly all the necessary supportive tools are owned by the partners. METIS-II investigates the possibility of improving the translator's workbench component of applications which deal with multilinguality issues, which, in turn, are central to the effort of satisfying the conditions for the information society. METIS-II is intended to develop an innovative MT system and addresses high-risk, visionary research on efficient technology for handling multilinguality issues.

Coordinator

INSTITUTE FOR LANGUAGE AND SPEECH PROCESSING
Address
Epidavrou & Artemidos 6
15125 Maroussi - Athens
Greece

Participants (3)

Fundació Universitat Pompeu Fabra
Spain
Address
Plaça De La Mercè, 10 - 12
08002 Barcelona
GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN INFORMATIONSFORSCHUNG E.V.
Germany
Address
Martin-Luther-Strasse 14
66111 Saarbruecken
KATHOLIEKE UNIVERSITEIT LEUVEN
Belgium
Address
Oude Markt 13
3000 Leuven