Periodic Reporting for period 2 - QT21 (QT21: Quality Translation 21)
Reporting period: 2016-08-01 to 2018-01-31
(1) substantially improved statistical and machine-learning based translation models for challenging languages and resource scenarios,
(2) improved evaluation and continuous learning from mistakes, guided by a systematic analysis of quality barriers, informed by human translators,
(3) all with a strong focus on scalability, to ensure that learning and decoding with these models is efficient and that reliance on data (annotated or not) is minimised.
To continuously measure progress, and to provide a platform for sharing and collaboration (QT21 internally and beyond), the project revolves around a series of Shared Tasks, for maximum impact co-organised with WMT.
The project has published 207 scientific papers: 88 first-tier international conference papers, 16 journal articles and 24 shared-task system papers; the remaining 79 were published at second-tier conferences and workshops. QT21 PIs have been invited to deliver 32 international scientific keynotes and tutorials. In addition, they have been invited 28 times by companies and other non-scientific and public organisations (e.g. chambers of commerce) to present on the state of the art in Machine Translation (MT). QT21 knowledge transfer to industry has been supported by 19 talks at top-tier international industry conferences, including LocWorld, Tekom and GALA, and, more technically, through 21 industry-related workshops. In collaboration with GALA, QT21 produced 9 hours of live webinar content, attended by 718 individuals and viewed by more than 1,000.
QT21 has produced the largest available data set of human post-edits and human error annotations, covering four language pairs. All of the project's software is open source and available through its website.
The outcomes of QT21 are documented in 64 deliverables.
2) Main Objectives
QT21 focused on MT for challenging, morphologically complex and syntactically varied languages. The project concentrated on 5 language pairs: 4 with English as source (English->German, English->Czech, English->Latvian, English->Romanian) and 1 with English as target (German->English). To measure progress and compare QT21 with the international state of the art (s-o-t-a), QT21 co-organised WMT 2016, 2017 and 2018 (the Workshop on Machine Translation, http://statmt.org/wmt**) to benchmark MT technologies on shared tasks. The goals were to:
(1) improve statistical and machine-learning based translation models for challenging languages and resource scenarios;
(2) ensure that learning and decoding with these models is efficient and that reliance on data (annotated or not) is minimised;
(3) improve evaluation and continuous learning from mistakes, informed by human translators and post-editors, guided by a systematic analysis of quality barriers;
(4) provide a platform for sharing, collaboration and evaluation (within QT21 and beyond): QT21 revolves around Shared Tasks, co-organised with WMT for maximum impact;
(5) support early technology transfer: QT21 implemented a Technology Bridge linking ICT-17(a) and (b), demonstrating the technical feasibility of early research outputs in industry-focused environments.
3) Main Results Achieved:
(1) QT21 has made substantial contributions to Neural Machine Translation (NMT), pushing the state-of-the-art for NMT to comprehensively outperform the previous state-of-the-art held for many years by the family of Phrase-based Statistical MT (PB-SMT). Core technical contributions include “back translation” to produce synthetic training data, Byte Pair Encoding (BPE) to compress vocabularies of morphologically rich languages, and deeper recurrent neural networks. At the international competitions WMT16 and WMT17, QT21 systems won more than 80% of all shared tasks, outperforming large-scale commercial MT systems on En↔De, En↔Cz and En→Ro, the core languages of QT21.
(2) QT21 introduced back-translation (see Objective (1)), reducing the dependency on bilingual data. QT21 used BPE (see Objective (1)), improving MT for morphologically rich languages by significantly compressing the vocabulary representation and addressing out-of-vocabulary (OOV) issues in automatic translation. QT21 showed that multilingual embeddings can efficiently support transfer learning for under-resourced languages. Further, QT21's work on interlingual factors opens the door to translating languages not seen during training.
(3) QT21 systems won all WMT16 MT evaluation metrics tasks. In addition, QT21 won the WMT16 Quality Estimation (QE) shared task on “document-level quality”. In the Automatic Post-Editing (APE, learning from post-edits of professional translators) shared tasks, QT21 improved the baseline by 2.64 BLEU points with the second-best performance at WMT16, and won the WMT17 task, improving the baseline by 7.6 BLEU points. A QT21 APE system that learns online from human post-editors further improves the MT s-o-t-a by 1 to 2 BLEU points. QT21 further developed Direct Assessment (DA), showing that crowdsourcing can be an effective way of reliably evaluating MT systems at scale. QT21 has harmonised the two major typologies for diagnostic MT error analysis: QT21's own MQM (Multidimensional Quality Metrics) and TAUS' industry standard DQF (Dynamic Quality Framework). QT21 also revitalised interest in test suites for diagnostic MT evaluation.
(4) The organisation of WMT (co-organised with CRACKER-Horizon2020 # 645357) is at the core of this objective. The +48% increase in submissions from 2015 to 2017 on the main task (News Task) and the tripling of participation in the APE task between 2015 and 2017 show the value and recognition WMT enjoys in the community.
(5) To implement the QT21 ICT-17 Technology Bridge, QT21 ran workshops on QT21 research outcomes and technologies with DGT (MT@EC), HimL (Horizon2020-ICT17b #644402), MMT (Horizon2020-ICT17b #645487), TraMOOC (Horizon-ICT17b #644333), and KConnect (Horizon2020-ICT15 #644753). All ICT-17(b) projects used QT21 engines and technologies. A joint QT21-HimL submission entered WMT16. DGT (MT@EC) is in the process of switching from SMT to NMT.
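To make the BPE technique credited above concrete, the following is a minimal, illustrative sketch of the byte-pair-encoding idea (iteratively merging the most frequent adjacent symbol pair) using only the Python standard library. The toy vocabulary and function names are our own; production systems use optimised implementations trained on full corpora.

```python
import re
from collections import Counter

def learn_bpe(vocab, num_merges):
    """Learn BPE merge operations from a word-frequency dictionary.

    `vocab` maps space-separated symbol sequences (initially single
    characters plus an end-of-word marker) to corpus frequencies;
    each step merges the most frequent adjacent symbol pair, so
    frequent subwords become single vocabulary units.
    """
    vocab = dict(vocab)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace the pair only where both parts occur as whole symbols.
        pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(best)) + r"(?!\S)")
        vocab = {pattern.sub("".join(best), w): f for w, f in vocab.items()}
    return merges

# Toy corpus: the pair ("w", "e") is the most frequent, so it merges first.
toy_vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6}
merges = learn_bpe(toy_vocab, 10)
```

Because rare words are segmented into such learned subword units rather than mapped to an unknown token, the open vocabularies of morphologically rich languages shrink to a fixed merge inventory.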
The traction that the QT21-harmonised MQM-DQF error annotation paradigm is gaining in industry further demonstrates the impact of QT21's work on the translation industry.
New state-of-the-art (SOTA) results have been obtained with “back translation”, Byte Pair Encoding (BPE), deeper networks, layer normalisation, factoring of linguistic information and, to some extent, system combination (ensembling).
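The back-translation technique listed above can be sketched schematically: a reverse (target-to-source) model translates monolingual target-language text to manufacture synthetic source sentences, yielding extra parallel data for training the forward model. The word-level dictionary below is a toy stand-in for a real reverse MT model, used only to illustrate the data flow; all names here are hypothetical.

```python
# Toy stand-in for a trained German->English reverse model. In a real
# pipeline this would be a neural MT system, not a word dictionary.
TOY_DE_EN = {"das": "the", "haus": "house", "ist": "is", "grün": "green"}

def reverse_translate(target_sentence):
    """Toy target->source 'model' used only to manufacture synthetic sources."""
    return " ".join(TOY_DE_EN.get(w, w) for w in target_sentence.split())

def back_translate(monolingual_target):
    """Turn monolingual target-side text into synthetic parallel pairs.

    The synthetic (source, target) pairs are then mixed with genuine
    parallel data when training the forward source->target model, so the
    decoder still learns from authentic target-language sentences.
    """
    return [(reverse_translate(t), t) for t in monolingual_target]

mono_de = ["das haus ist grün"]
synthetic = back_translate(mono_de)
```

The key design point is that only the source side is machine-generated: the target side remains genuine human text, which is why the technique reduces dependency on scarce bilingual data.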
Automatic Post-Editing has improved by +7.6 BLEU points. Furthermore, QT21's APE system improves translations by 1 to 2 BLEU points on data that carries no annotation (i.e. without using annotation information).
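For readers unfamiliar with the metric behind these gains, the following is a simplified sentence-level sketch of BLEU (geometric mean of clipped n-gram precisions times a brevity penalty, reported on a 0-100 scale, so a "+7.6 BLEU point" gain is +7.6 on this scale). It is an illustrative approximation, not the exact corpus-level variant used in WMT evaluations, and uses a small floor constant to avoid log(0).

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU on a 0-100 scale.

    Averages log precisions of clipped 1..max_n-gram matches and applies
    a brevity penalty so that overly short candidates are penalised.
    """
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c_ngrams, r_ngrams = ngrams(cand, n), ngrams(ref, n)
        # Clip each n-gram count by its count in the reference.
        overlap = sum(min(c, r_ngrams[g]) for g, c in c_ngrams.items())
        total = max(1, sum(c_ngrams.values()))
        log_prec += math.log(max(overlap, 1e-9) / total) / max_n
    brevity = min(1.0, math.exp(1 - len(ref) / max(1, len(cand))))
    return 100 * brevity * math.exp(log_prec)
```

A candidate identical to the reference scores 100; truncated or divergent candidates score strictly lower, which is what the reported point differences quantify.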