Please note that the project factsheets will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet, which is updated on a regular basis with public deliverables, etc.
MateCat - Machine Translation Enhanced Computer Assisted Translation
287688 - STREP
At a glance
FP7-ICT-2011-7 - Language technologies
The MateCat project will advance CAT (Computer Assisted Translation) technology by investigating innovative methods for effectively integrating SMT (Statistical Machine Translation) into the human translation workflow. The project builds on Moses, an open source toolkit widely adopted by research labs and SMEs around the world, and will improve the SMT paradigm by investigating methods that make SMT aware of its use, self-tuning to the task (offline learning, domain adaptation), able to learn from user feedback (online learning), and more informative.
The project will release its main outcomes as open source and set up a User Group, including end-users, service providers, and technology developers, to foster the uptake of project results.
Objectives and Innovation
Today, statistical MT is mainly trained with the objective of creating the most comprehensible output for a final user. MateCat aims instead to create an MT system whose output minimizes the time the translator needs to post-edit it.
The main challenges are: building an effective platform for collecting the corrections and changes applied by translators; applying state-of-the-art statistical modelling to extract correlations between translators' actions and MT output; understanding how this analysis can be turned into a new MT training process; and, finally, integrating these technologies into a single tool.
MateCat will go beyond the state of the art by investigating new research issues related to the integration of MT into the CAT working process and by advancing statistical MT technology along three main directions: self-tuning MT, user adaptive MT, and informative MT. Progress will be measured through field tests evaluating the utility and usability of the enhanced MT by means of objective and subjective criteria. Field tests will be carried out by professional translators working on real translation projects. Objective criteria will compare the translation effort of users employing a CAT tool with and without MateCat-enhanced MT technology.
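The factsheet does not say which objective criteria are used. As a purely illustrative sketch, one common proxy for post-editing effort is the word-level edit distance between the raw MT output and the translator's post-edited version, normalised by the length of the post-edited text. The function names and the toy sentences below are assumptions made for illustration and are not taken from the project.

```python
def word_edit_distance(hypothesis, reference):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(reference) + 1))
    for i, h in enumerate(hypothesis, start=1):
        curr = [i]
        for j, r in enumerate(reference, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def post_edit_rate(mt_output, post_edited):
    """Edits per post-edited word (hypothetical effort proxy, lower is better)."""
    mt_tokens = mt_output.split()
    pe_tokens = post_edited.split()
    return word_edit_distance(mt_tokens, pe_tokens) / max(len(pe_tokens), 1)


def mean_post_edit_rate(pairs):
    """Average post-edit rate over (mt_output, post_edited) segment pairs."""
    rates = [post_edit_rate(mt, pe) for mt, pe in pairs]
    return sum(rates) / len(rates) if rates else 0.0


# Invented toy data: the same segment post-edited from two MT systems.
baseline = [("the cat sat on mat", "the cat sat on the mat")]
enhanced = [("the cat sat on the mat", "the cat sat on the mat")]

print("baseline:", mean_post_edit_rate(baseline))
print("enhanced:", mean_post_edit_rate(enhanced))
```

A lower mean rate for one condition would suggest less editing was needed, though a field test would typically also record post-editing time, which this sketch does not model.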
Target Group of the project
The end users targeted by the project are professional translators, although the project results will also be directly exploited by MT researchers and by companies interested in integrating MT within CAT tools. MateCat's results will be released as open source and made available to a variety of organizations operating in the translation sector, including European SMEs offering translation services, translation offices of large public bodies and IT companies, and educational institutions.
MateCat will develop a new CAT tool integrating the novel MT functionalities developed in the project. Project results will be extensively tested in field trials conducted by the industrial partner in the project. The results of the lab and field tests will be made publicly available. The project will develop and evaluate new methods within the field of statistical MT, which will be the subject of scientific publications by the academic partners and will be reported in the publicly available project deliverables. Algorithms implementing the new methods will be integrated in three incremental versions of a CAT tool prototype. All prototypes will be released as open source.
MateCat aims to maintain Europe's competitiveness in the emerging application of statistical MT, in particular in the CAT scenario. It aims to develop new SMT algorithms and to further advance Moses, the European open source platform for statistical MT, by allowing for its integration in commercial CAT software platforms, which are the de facto standard of the world's translation industry. Finally, the project partners expect that enhanced and more usable machine translation technology will improve Europe's competitive position in the multilingual digital market through the supply of better services to citizens and businesses.
Where will the project be present?
The project will be represented at the following venues, which gather both research and industry representatives:
- Conference of the European Association for Machine Translation
- Conference of the Association for Machine Translation in the Americas
- TAUS Summit
- Localization World
- Machine Translation Summit
This page is maintained by: Susan Fraser