Community Research and Development Information Service - CORDIS

Cutting-edge research methods and tools to accelerate the deployment of machine translation systems

Multilingualism is one of Europe's biggest challenges. To address it, an EU initiative strengthened cooperation between industry and academia in order to reverse the low industrial adoption of machine translation (MT).
The EU-funded ABU-MATRAN (Automatic building of machine translation) project developed a set of tools to speed up the development of statistical and rule-based MT systems. It also developed techniques to improve translation quality for morphologically rich languages.

Project partners provided MT for the Croatian language. An online English-Croatian MT system, based on publicly available resources, was released in 2013 to mark the country’s accession to the EU. It was later improved by adopting the neural MT approach and by using data selection techniques and a rule-based MT system for Croatian-Serbian that was developed collaboratively during secondments.

A Croatian MT system for tourism was also released. It was built using parallel data from the tourism domain, acquired with the web crawlers developed during the project. Results show that this system outperforms other online MT systems for this language pair and domain.

The ABU-MATRAN team then extended the Croatian MT system to related South Slavic languages of Candidate Countries, releasing MT systems for Bosnian, Serbian and Slovenian. The techniques developed for these three languages were then applied successfully to other languages, demonstrating a high degree of language independence. Specifically, researchers developed MT systems for English-Finnish, Spanish-Catalan and Spanish-Basque.

Secondments took place and several workshops were organised to transfer knowledge on issues such as management and processes from industry to academia. These events aimed to make research and development products more robust, and focused on efficient software management and linguistic data creation. To support knowledge transfer beyond the project, all resources were released as free/open-source software and are available on the project website.

ABU-MATRAN identified key research practices to accelerate the rollout of MT systems and prepare them for commercial exploitation. Just as important, it conveyed this knowledge to industry.

Subjects

Life Sciences

Keywords

Machine translation, ABU-MATRAN, web crawlers