Global Under-Resourced MEdia Translation

Project description

Improving neural machine translation for low-resource languages

In a world where accurate and timely information has become imperative, journalists constantly need tools for fast and accurate translation of languages with very few resources. Although neural machine translation technology is advancing rapidly, it has not yet managed to deliver usable translations for most of the world's language pairs, because of the lack of data and parallel corpora. The EU-funded GoURMET project aims to improve the robustness and applicability of neural machine translation for low-resource language pairs and domains. The project will focus on global content creation, by providing machine translations for correction by humans, and on international news media monitoring for low-resource language pairs.

Objective

Machine translation (MT) is an increasingly important technology for supporting communication in a globalised world. MT technology has gradually improved over the last ten years, but recent advances in neural machine translation (NMT) have attracted significant interest in industry and have led to very rapid adoption of the new paradigm (e.g. Google, Facebook, UN, World Intellectual Property Organization). Although these models have significantly advanced state-of-the-art performance, they are data intensive and require parallel corpora of many millions of human-translated sentences for training. As a result, neural machine translation is currently unable to deliver usable translations for the vast majority of language pairs in the world. This is especially problematic for our user partners, the BBC and DW, who need access to fast and accurate translation for languages with very few resources.

The aim of GoURMET is to significantly improve the robustness and applicability of neural machine translation for low-resource language pairs and domains.

GoURMET has five objectives:
- Development of high-quality machine translation for under-resourced language pairs and domains;
- Adaptability to new and emerging languages and domains;
- Development of tools for analysts and journalists;
- A sustainable, maintainable platform and services;
- Dissemination and communication of project results to stakeholders and user groups.

The project will focus on two use cases:
- Global content creation - managing content creation in several languages efficiently by providing machine translations for correction by humans;
- Media monitoring for low-resource language pairs - tools to address the challenge of international news monitoring.

The outputs of the project will be field-tested at partners BBC and DW, and the platform will be further validated through innovation intensives such as the BBC NewsHack.

Call for proposal

H2020-ICT-2018-20

Sub call

H2020-ICT-2018-2

Coordinator

THE UNIVERSITY OF EDINBURGH
Net EU contribution
€ 948 047,50
Address
OLD COLLEGE, SOUTH BRIDGE
EH8 9YL Edinburgh
United Kingdom

Region
Scotland, Eastern Scotland, Edinburgh
Activity type
Higher or Secondary Education Establishments
Total cost
€ 948 047,50

Participants (4)