LetsMT! - Platform for Online Sharing of Training Data and Building User Tailored MT
250456 - Pilot B

At a glance
PSP -2009.5.1 - Machine Translation for the Multilingual Web
- Duration: 30 months
- Start date: 1 March 2010
- End date: 31 August 2012
- Project officer: Kimmo Rossi
- Website
- Let's MT platform
- Annual Report 2011
- Annual Report 2010
- Flyer
- Poster
Challenge
In recent years, statistical machine translation (SMT) has provided a major breakthrough in machine translation development. SMT systems are built by analyzing huge volumes of parallel corpora and training translation models on this data. The quality of an SMT system largely depends on the size of the training data available. Since most existing parallel corpora cover major languages, SMT systems for larger languages are of much better quality than systems for smaller languages. Current systems are built on data accessible on the web, but this is just a fraction of all parallel texts: the majority still resides in the local systems of corporations, in public and private institutions, and on the desktops of individual users. The cost and know-how required to build customised MT solutions deter many SMEs from utilizing the power of MT technologies.
Goal
To fully exploit the huge potential of existing open SMT technologies, the project will build an innovative online collaborative platform for sharing data and providing MT solutions. It aims at a major breakthrough in the availability of parallel language resources and, consequently, of machine translation services of good and acceptable quality. The project especially targets the less-covered languages, where current machine translation systems perform poorly due to the limited availability of training data.
Innovation
The project will extend the use of existing state-of-the-art SMT methods by applying them to data supplied by users in order to produce better-trained machine translation solutions. Let'sMT! will integrate Moses, the open-source statistical machine translation engine, and GIZA++, the open-source word alignment tool, providing simple and easy-to-use user interfaces for this software.
The result
The project will deliver the following core functionalities:
- website for upload of parallel corpora and building of specific MT solutions,
- website for translation where source text can be inserted and translated,
- translation widget provided for free inclusion into websites to translate their content,
- browser plug-ins or add-ons that would allow the quickest access to translation,
- web service for integration in CAT tools and other applications.
Some of the functionalities are already available here.
Impact
Let'sMT! is expected to diversify publicly available MT services for all languages by enabling them to be tailored to specialised domains and other user requirements. It will thus contribute towards a much wider application of MT for business users and in particular for SMEs.
Let'sMT! will particularly impact the localisation industry and financial news services by providing cost-effective MT solutions for smaller languages.
Co-ordinator
Contact person: Aivars Berzins
This page is maintained by: Susan Fraser
