Automatic building of Machine Translation

Cutting-edge research methods and tools to accelerate the deployment of machine translation systems

Multilingualism is one of Europe's biggest challenges. To tackle it, an EU initiative strengthened cooperation between industry and academia in order to reverse the low level of industrial adoption of machine translation (MT).

Digital Economy
Industrial Technologies
Society
Fundamental Research

The EU-funded ABU-MATRAN (Automatic building of machine translation) project developed a set of tools to speed up the development of statistical and rule-based MT systems. It also developed techniques to improve translation quality for morphologically rich languages.

Project partners provided MT for the Croatian language. An online English-Croatian MT system based on publicly available resources was released in 2013 to mark the country’s accession to the EU. It was further improved by following the neural MT approach and by using data selection techniques and a rule-based MT system for Croatian-Serbian that was developed collaboratively during secondments. A Croatian MT system for tourism was also released, built using tourism parallel data acquired with the web crawlers developed during the project. Results show that this system outperforms other online MT systems for this language pair and domain.

The ABU-MATRAN team then extended the Croatian MT system to related South Slavic languages of candidate countries, releasing MT systems for Bosnian, Serbian and Slovenian. The techniques developed for these three languages were applied successfully to other languages, demonstrating their high degree of language independence: researchers developed MT systems for English-Finnish, Spanish-Catalan and Spanish-Basque.

Secondments were implemented and several workshops were organised to transfer knowledge on issues such as management and processes from industry to academia. These events aimed to make research and development products more robust, and focused on efficient software management and linguistic data creation. For effective knowledge transfer beyond the project, all resources were released as free/open-source software and are available on the website.

ABU-MATRAN identified key research practices to accelerate the rollout of MT systems and prepare them for commercial exploitation. Just as important, it conveyed this knowledge to industry.

Keywords

Machine translation, ABU-MATRAN, web crawlers
