EXPloiting Empirical appRoaches to Translation

Final Report Summary - EXPERT (EXPloiting Empirical appRoaches to Translation)

The main aim of the EXPERT (EXPloiting Empirical appRoaches to Translation) project was to train researchers, namely Early Stage Researchers (ESRs) and Experienced Researchers (ERs), to promote the research, development and use of hybrid language translation technologies, and create future world leaders in the field. The overall objective was to provide innovative research and training in the fields of Translation Memory (TM) and Machine Translation (MT) to 15 Marie Curie Fellows:
• 12 PhD students (Early Stage Researchers - ESRs) with 36-month contracts.
• 3 post-doctoral researchers (Experienced Researchers - ERs), one with a 24-month contract and two with 18-month contracts.

The Fellows participated in an integrated research programme of 15 individual projects, each addressing a different aspect of data-driven hybrid machine translation. The researchers had access to a vibrant training programme, which consisted of four large training events that were run across the whole consortium and engaged all the fellows: (1) Scientific and technological training, (2) Complementary skills training, (3) Scientific and technological workshop and (4) Business showcase.

In addition, they took part in intersectoral and transnational mobility through secondments and shorter visits to industrial and academic partners. Each researcher received training from their host institution, and all the ESRs were registered on doctoral programmes. Feedback from the fellows indicates that they greatly appreciated both the overall scheme and the training they received. At the end of the project, one of the ESRs had received his PhD and all the others were progressing very well with their theses, with a few of them having their vivas scheduled within the following 3 months. All but two of the researchers had found employment by the end of the project, distributed more or less equally between academia and industry. Some of them stayed at the institution that appointed them initially, whilst others moved on. The vast majority are working in the field of translation or in related areas. The two fellows who are not currently employed decided to focus on their theses and to look for employment only after submitting them.

Project Objectives:

The main research objectives of EXPERT were twofold:
1. To improve existing corpus-based TM and MT technologies by addressing their well-known shortcomings via the use of more sophisticated levels of linguistic processing, terminology and domain knowledge, along with better consideration of user requirements, in order to improve translation quality and user satisfaction, and allow quick development for new language pairs.
2. To create hybrid technologies which incorporate the main features of corpus-based approaches, minimizing human translators’ effort and tailoring the system according to their needs, by allowing different levels of “assistance” to be provided in a user-friendly workflow.

The project was coordinated by the University of Wolverhampton. The consortium consisted of 6 academic partners (the coordinator together with the University of Malaga, University of Sheffield, Saarland University, University of Amsterdam and Dublin City University), three companies (Translated, Hermes and Pangeanic) and four associated partners (eTrad, Wordfast, Unbabel and DFKI).

After addressing the difficulties caused by the late recruitment of some of the researchers, the project progressed smoothly and delivered all its milestones and deliverables. All the fellows were appointed by the partners, with some changes to accommodate the delays. One of these changes was to split one of the ERs’ projects into three smaller projects, each lasting between 4 and 6 months and carried out by a different researcher. Despite this, the outputs planned for the original individual project were delivered.

The consortium delivered all four of the scheduled consortium-wide training events. They were all well attended, both by members of the network and by participants from outside it. Feedback collected after the events indicated that most of the audience appreciated them very much. To enrich their training experience, the fellows completed transnational and intersectoral secondments. Continuous discussions were held within the consortium to ensure that fellows took part in the secondments from which they would benefit the most, even if this meant a change to the original schedule. The reports submitted by the Fellows after the secondments indicate that this goal was largely achieved.

Several management meetings were organised to ensure the project kept to its objectives. A logo, a website and mailing lists were created to give the project both an identity and a means of communication for consortium members and the wider research community. All the reports required in the project were submitted on time and accepted by the EC.

Impact:

Automatic translation is an undeniable need in a globalised world where communication across several languages is increasingly relevant. TM and MT systems are the two most elaborate technologies for supporting human translation. Recent developments have shown the potential of data-driven approaches for producing fast and low-cost translations. A number of user studies have, however, established shortcomings in the current state of the art, including poor-quality translations for low-resource languages and interfaces that do not take into account user requirements and user feedback.

The individual projects carried out by the EXPERT fellows have produced research that has advanced the state of the art in data-driven and hybrid machine translation. In addition to the more than 160 publications produced in the project, several resources and tools have been made available to the research community. Some of the research carried out in the project has been trialled by the industrial partners and incorporated into their day-to-day activities. One of the main innovations of the project is ActivaTM, developed at Pangeanic, which promotes a new concept in translation memory database management. The work was labelled as revolutionary by experts in the field.

The EXPERT project created an Initial Training Network to train young researchers in ways to improve current data-driven MT technologies, both by combining these technologies to exploit their individual strengths and by addressing some of the main limitations of each. We expect that training researchers in the new skills required for the development and use of technologies that can increase productivity and reduce costs in the translation sector, as well as facilitate reliable communication and content creation in multiple languages, will contribute to several aspects of Europe’s ICT development. In addition, it has already been noted that the career development prospects of the Fellows involved are likely to be considerably enhanced by their participation in the project, as it contributes not only to their networking capacities but also to their research and complementary skills.

The project's public website can be found at the following web address: http://expert-itn.eu.