The Community Research and Development Information Service - CORDIS
Information & Communication Technologies

Language Technologies


Please note that the project factsheets will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet. This is updated on a regular basis with public deliverables, etc.

LetsMT! - Platform for Online Sharing of Training Data and Building User Tailored MT

250456 - Pilot B

At a glance

PSP -2009.5.1 - Machine Translation for the Multilingual Web

Challenge

In recent years, statistical machine translation (SMT) has brought a major breakthrough in machine translation development. SMT systems are built by analysing large volumes of parallel corpora and training translation models on this data, so their quality depends largely on the amount of training data available. Since most existing parallel corpora cover major languages, SMT systems for larger languages are of much better quality than systems for smaller languages. Current systems are built on data accessible on the web, but this is only a fraction of all parallel texts: the majority still resides in the local systems of corporations, public and private institutions, and on the desktops of individual users. The cost and know-how required to build customised MT solutions deter many SMEs from using MT technologies.

Goal

To fully exploit the huge potential of existing open SMT technologies, the project will build an innovative online collaborative platform for sharing data and providing MT solutions. It aims at a major breakthrough in the availability of parallel language resources and, consequently, of machine translation services of good and acceptable quality. The project especially targets the less-covered languages, for which current machine translation systems perform poorly because of the limited availability of training data.

Innovation

The project will extend the use of existing state-of-the-art SMT methods by applying them to data supplied by users in order to produce better-trained machine translation solutions. Let'sMT! will integrate Moses, the open-source statistical machine translation engine, and GIZA++, the open-source word alignment tool, providing simple and easy-to-use user interfaces for this software.

The result

The project will deliver the following core functionalities:

  • website for upload of parallel corpora and building of specific MT solutions,
  • website for translation where source text can be inserted and translated,
  • translation widget provided for free inclusion into websites to translate their content,
  • browser plug-ins or add-ons that would allow the quickest access to translation,
  • web service for integration in CAT tools and other applications.

Some of the functionalities are already available online.

Impact

Let'sMT! is expected to diversify publicly available MT services for all languages by enabling them to be tailored to specialised domains and other user requirements. It will thus contribute towards a much wider application of MT for business users and in particular for SMEs.

Let'sMT! will particularly impact the localisation industry and financial news services by providing cost-effective MT solutions for smaller languages.

Co-ordinator

Contact Person:

Name: Aivars Berzins
Tel: +37167605001
Fax: +37167605750
Organisation: Tilde


This page is maintained by: Susan Fraser