Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research

Deliverables

Report on IWSLT 2016 and WMT 2016

Analysis of the 2016 shared task results, together with the training and test materials which will be uploaded to META-SHARE.

Report on IWSLT 2015

Analysis of the 2015 shared task results, together with the training and test materials which will be uploaded to META-SHARE.

New version of META-SHARE software (Update)

This deliverable will provide an extended and improved version of the META-SHARE platform with simplified operations for resource providers and depositors.

Report on IWSLT 2017 and WMT 2017

Analysis of the 2017 shared task results, together with the training and test materials which will be uploaded to META-SHARE.

New version of META-SHARE software

This deliverable will provide an extended and improved version of the META-SHARE platform with a streamlined and adapted data model as well as an adapted licensing toolkit.

Kick-off meeting of the ICT-17 group of funded projects

Organisation of a kick-off meeting of the whole ICT-17 pillar of funded projects, i.e., the big research project to be funded under ICT-17a and the innovation pilots to be funded under ICT-17b.

Tools for data creation and curation

This deliverable is an extended and improved version of the translate5 software that enables all ICT-17 projects to prepare and curate data sets, specifically with regard to information related to quality translation. It is foreseen to subcontract the company MittagQI.

Website for the QT initiative

The web portal for the QT initiative (the group of ICT-17 projects and other projects) will be the public face of the initiative, which CRACKER will help to coordinate and around which it will help to build a community.

Project infrastructure (final version)

Completed version of project infrastructure (especially the website), which is specifically geared towards communication and dissemination purposes.

Project infrastructure (initial version)

Initial version of project infrastructure for the coordination of the project, including email lists, project website, intranet, etc.

Report on QT Marathon 2015

This report will provide data about the QT Marathon 2015 (number of participants, overview of talks etc.) and a summary of the research projects. In addition, final presentations of the project results will be uploaded to the web.

Report on META-FORUM 2015

Organisation of the conference META-FORUM 2015 with the help of a subcontractor; the organiser and location currently foreseen is the LSP Tilde in Riga, Latvia.

Report on META-FORUM 2017

Organisation of the conference META-FORUM 2017 with the help of a subcontractor; the organiser and location currently foreseen is the Hungarian Academy of Sciences in Budapest, Hungary.

Report on META-FORUM 2016

Organisation of the conference META-FORUM 2016 with the help of a subcontractor; the organiser and location currently foreseen is the University of Lisbon in Lisbon, Portugal.

Survey on the state of HQMT in industry and LSPs

This report summarises the results of the survey on the economic impact and uptake of recent EC-funded MT actions, especially with regard to industry and language service providers (LSPs). It is foreseen to subcontract TAUS and GALA.

Coordination with and support of MLi

This report will provide a summary of the support of and collaboration with MLi on resource infrastructures in relation to their deployment for building and offering multilingual digital services.

Coordination with and support of ICT-17a and ICT-17b projects

This deliverable will provide an overview of the coordination and support of the ICT-17 projects in their MT-related resource sharing activities.

Report on QT Marathon 2016

This report will provide data about the QT Marathon 2016 (number of participants, overview of talks etc.) and a summary of the research projects. In addition, final presentations of the project results will be uploaded to the web.

Data Management Plan

The CRACKER Data Management Plan will provide the data management policy of the project with regard to the produced data sets, containing, among others, information on standards and metadata used as well as on sharing, archiving and preservation.

Data Management Plan (Update)

The CRACKER Data Management Plan will provide the data management policy of the project with regard to the produced data sets, containing, among others, information on standards and metadata used as well as on sharing, archiving and preservation.

Coordination with and support of LIDER

This deliverable will report on the support and coordination activities with LIDER in rendering the META-SHARE data model in RDF following the recommendations of the W3C.

Report on coordination between MT research and CEF

This report will provide an overview of the cooperation between the MT research community as assembled, among others, in the ICT-17 pillar of projects and initiatives such as META-NET and the Automated Translation activities in the digital component of CEF. The report will also contain a list of recommendations and action items for future cooperation.

Roadmap for European MT Research

Continuation of the roadmapping action for QT research in Europe, initiated by META-NET (2010-2013) and continued by QTLaunchPad (2012-2014); final version to be presented at META-FORUM 2017.

Strategic Research and Innovation Agenda for the LT/MT field

A joint deliverable with LT_Observatory.

Data Management Plan (Final Version)

The CRACKER Data Management Plan will provide the data management policy of the project with regard to the produced data sets, containing, among others, information on standards and metadata used as well as on sharing, archiving and preservation.

Position Paper and preliminary joint Strategic Research and Innovation Agenda for the LT/MT field

Position Paper prepared jointly and endorsed by CRACKER and LT_Observatory underpinned by a preliminary version of a joint Strategic Research and Innovation Agenda for the LT/MT field (SRIA). The form and content of these documents will be agreed between the CRACKER and LT_Observatory projects.

Publications

Mehrsprachigkeit für das Digitale Europa. Ringvorlesung Digitale Lebenswelten

Author(s): Georg Rehm
Published in: 2016

The role of Translators and Translation Technologies for the Multilingual Digital Single Market.

Author(s): Georg Rehm
Published in: Translating Europe Forum 2016, 2016

Overview of the IWSLT 2017 Evaluation Campaign

Author(s): M. Cettolo, M. Federico, L. Bentivogli, J. Niehues, S. Stüker, K. Sudoh, K. Yoshino, C. Federmann
Published in: Proceedings of the 14th Workshop on Spoken Language Translation, 2017, Page(s) 2-14

Round Table on Language Technologies

Author(s): Stelios Piperidis
Published in: Translating Europe Forum 2016, 2016

Neural versus Phrase-Based MT Quality: an In-Depth Analysis on English-German and English-French

Author(s): L. Bentivogli, A. Bisazza, M. Federico, M. Cettolo
Published in: Computer Speech and Language, Issue Vol. 49, 2018, Page(s) 52-70

Technologies for Breaking Language Barriers in Europe. Beyond Language Barriers

Author(s): Georg Rehm and Stelios Piperidis
Published in: 2017

Artificial Intelligence for Translation Technologies and Multilingual Europe

Author(s): Georg Rehm and Josef van Genabith
Published in: Proceedings of the DG TRAD Conference - Translation Services in the Digital World: A Sneak Peek into the (near) Future, 2017

Acquisition and management of MT-related resources, Optimising Knowledge and Practice for Operational Language Resources

Author(s): Stelios Piperidis
Published in: 2016

Artificial Intelligence for Translation Technologies and Multilingual Europe

Author(s): Georg Rehm
Published in: 2017

Multilingual Europe in late 2016 - A Strategic Research and Innovation Agenda for the Multilingual Digital Single Market.

Author(s): Georg Rehm
Published in: Future and Emerging Trends in Language Technologies, Machine Learning and Big Data (FETLT 2016), 2016

Findings of the 2016 conference on machine translation.

Author(s): Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurelie Neveol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, and Marcos Zampieri
Published in: 2016

The IWSLT 2016 Evaluation Campaign

Author(s): M. Cettolo, J. Niehues, S. Stüker, L. Bentivogli, M. Federico
Published in: Proceedings of the 13th Workshop on Spoken Language Translation, 2016, Page(s) 14

CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

Author(s): Zeman Daniel, Popel Martin, Straka Milan, Hajič Jan, Nivre Joakim, Ginter Filip, Luotolahti Juhani, Pyysalo Sampo, Petrov Slav, Potthast Martin, Tyers Francis, Badmaeva Elena, Gökırmak Memduh, Nedoluzhko Anna, Cinková Silvie, Hajič, jr. Jan, Hlaváčová Jaroslava, Kettnerová Václava, Urešová Zdeňka, Kanerva Jenna, Ojala Stina, Missilä Anna, Manning Christopher, Schuster Sebastian, Redd
Published in: Proceedings of the CoNLL 2017, 2017, Page(s) 1-19

META-SHARE als System zur Sicherung und Suche romanistischer Sprachressourcen. Workshop zum Forschungsdatenmanagement in der Romanistik

Author(s): Stelios Piperidis and Georg Rehm
Published in: 2017

Ten Years of WMT Evaluation Campaigns: Lessons Learnt.

Author(s): Ondrej Bojar, Christian Federmann, Barry Haddow, Philipp Koehn, Matt Post, Lucia Specia
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation - From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016

Findings of the 2017 conference on machine translation (wmt17)

Author(s): Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shujian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, and Marco Turchi
Published in: Proceedings of the Second Conference on Machine Translation, 2017, Page(s) 169-214

Language Technologies for Big Data - A Strategic Agenda for the Multilingual Digital Single Market. BDVA Summit (Big Data Value Association)

Author(s): Georg Rehm
Published in: 2016

Language Technology for Multilingual Europe: An Analysis of a Large-Scale Survey regarding Challenges, Demands, Gaps and Needs.

Author(s): Georg Rehm and Stefanie Hegele
Published in: Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018), 2018

Cracking the Language Barrier for a Multilingual Europe

Author(s): Georg Rehm
Published in: Pirkko Nuolijärvi and Gerhard Stickel (eds.), Language Use in Public Administration - Theory and practice in the European states. Contributions to the Annual Conference 2015 of EFNIL in Helsinki, 2016, Page(s) 41-58

Ten Years of WMT Evaluation Campaigns: Lessons Learnt

Author(s): Ondřej Bojar, Christian Federmann, Barry Haddow, Philipp Koehn, Matt Post and Lucia Specia
Published in: Proceedings of the LREC 2016 Workshop Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem, 2016

Findings of the 2016 Conference on Machine Translation

Author(s): Bojar, Ondrej and Chatterjee, Rajen and Federmann, Christian and Graham, Yvette and Haddow, Barry and Huck, Matthias and Jimeno Yepes, Antonio and Koehn, Philipp and Logacheva, Varvara and Monz, Christof and Negri, Matteo and Neveol, Aurelie and Neves, Mariana and Popel, Martin and Post, Matt and Rubino, Raphael and Scarton, Carolina and Specia, Lucia and Turchi
Published in: Proceedings of the First Conference on Machine Translation, 2016

Technology Landscape for Quality Evaluation: Combining the Needs of Research and Industry

Author(s): Kim Harris; Aljoscha Burchardt; Georg Rehm; Lucia Specia
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016, Page(s) 50-54

Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem

Author(s): Georg Rehm; Aljoscha Burchardt; Ondrej Bojar; Christian Dugast; Marcello Federico; Josef van Genabith; Barry Haddow; Jan Hajic; Kim Harris; Philipp Koehn; Matteo Negri; Martin Popel; Lucia Specia; Marco Turchi; Hans Uszkoreit (eds.)
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016

Towards a Systematic and Human-Informed Paradigm for High-Quality Machine Translation

Author(s): Aljoscha Burchardt; Kim Harris; Georg Rehm; Hans Uszkoreit
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016

Fostering the Next Generation of European Language Technology: Recent Developments ― Emerging Initiatives ― Challenges and Opportunities

Author(s): Georg Rehm; Jan Hajic; Josef van Genabith; Andrejs Vasiljevs
Published in: Proceedings of the Tenth International Conference on Language Resources and Evaluation, 2016

The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources

Author(s): Georg Rehm
Published in: Proceedings of the International Conference on Language Resources and Evaluation (LREC 2016), 2016

CRACKER - Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research

Author(s): Georg Rehm
Published in: Proceedings of the 19th Annual Conference of the European Association for Machine Translation (EAMT 2016), 2016

CRACKER: Cracking the Language Barrier

Author(s): Georg Rehm
Published in: Proceedings of the 18th Annual Conference of the European Association for Machine Translation, 2015

The IWSLT Evaluation Campaign: Challenges, Achievements, Future Directions

Author(s): L. Bentivogli, M. Federico, S. Stüker, M. Cettolo, J. Niehues
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation - From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016

The IWSLT 2015 Evaluation Campaign

Author(s): M. Cettolo, J. Niehues, S. Stüker, L. Bentivogli, R. Cattoni, M. Federico
Published in: Proceedings of the 12th Workshop on Spoken Language Translation, 2015

Neural versus Phrase-Based Machine Translation Quality: a Case Study

Author(s): L. Bentivogli, A. Bisazza, M. Cettolo, M. Federico
Published in: Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016

Word embeddings and discourse information for Quality Estimation

Author(s): Carolina Scarton, Daniel Beck, Kashif Shah, Karin Sim Smith, and Lucia Specia.
Published in: Proceedings of First Conference on Machine Translation (WMT16), Volume 2: Shared Task Papers, 2016

Findings of the 2016 Conference on Machine Translation

Author(s): Ondrej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurelie Neveol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, and Marcos Zampieri.
Published in: Proceedings of First Conference on Machine Translation (WMT16), Volume 2: Shared Task Papers, 2016

SHEF-Multimodal: Grounding Machine Translation on Images

Author(s): Kashif Shah, Josiah Wang, and Lucia Specia
Published in: Proceedings of First Conference on Machine Translation (WMT16), Volume 2: Shared Task Papers, 2016

SHEF-LIUM-NN: Sentence level Quality Estimation with Neural Network Features

Author(s): Kashif Shah, Fethi Bougares, Loic Barrault, and Lucia Specia.
Published in: Proceedings of First Conference on Machine Translation (WMT16), Volume 2: Shared Task Papers, 2016

SHEF-NN: Translation Quality Estimation with Neural Networks.

Author(s): Kashif Shah, Varvara Logacheva, Gustavo Paetzold, Frédéric Blain, Daniel Beck, Fethi Bougares, and Lucia Specia
Published in: Proceedings of Tenth Workshop on Statistical Machine Translation (WMT15), 2015

Investigating Continuous Space Language Models for Machine Translation Quality Estimation

Author(s): Kashif Shah, Raymond W.M. Ng, Fethi Bougares, and Lucia Specia
Published in: Proceedings of Conference on Empirical Methods in Natural Language Processing, 2015

Findings of the 2015 Workshop on Statistical Machine Translation.

Author(s): Ondrej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, and Marco Turchi.
Published in: Proceedings of Tenth Workshop on Statistical Machine Translation (WMT15), 2015, Page(s) 1-46

Large-scale Multitask Learning for Machine Translation Quality Estimation

Author(s): Kashif Shah and Lucia Specia
Published in: Proceedings of Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016

Digital Representation of Rights for Language Resources

Author(s): Rodriguez-Doncel, V. and P. Labropoulou
Published in: Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications, ACL-IJCNLP, 2015

One Ontology to Bind Them All: The META-SHARE OWL Ontology for the Interoperability of Linguistic Datasets on the Web




Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation

Author(s): Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo
Published in: Proceedings of the EMNLP Second Workshop on Discourse in Machine Translation (DiscoMT), 2015, Page(s) 1-16

Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities

Author(s): M. Fernández-Barrera, V. Popescu, A. Toral, F. Gaspari and K. Choukri
Published in: Proceedings of the 10th Language Resources and Evaluation Conference (LREC), 2016, Page(s) 4550-4556

Cracking The Language Barrier For A Multilingual Europe

Author(s): Georg Rehm
Published in: Language Use In Public Administration. Contributions To The Annual Conference 2015 Of EFNIL In Helsinki, 2016

Language Technologies for Multilingual Europe: Towards a Human Language Project. Strategic Research and Innovation Agenda. Version 1.0 2017

Author(s): Georg Rehm (ed.).
Published in: 2017

Language as a Data Type and Key Challenge for Big Data. Strategic Research and Innovation Agenda for the Multilingual Digital Single Market. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content

Author(s): Georg Rehm (ed.).
Published in: 2016

Human Language Technologies in a Multilingual Europe. Workshop Language Equality in the Digital Age - Towards a Human Language Project. Science and Technology Options Assessment (STOA)

Author(s): Georg Rehm
Published in: 2017

How Neural Machine Translation Can Unlock Europe’s Digital Single Market

Author(s): Georg Rehm, Rico Sennrich, Jan Hajic
Published in: Slator, 2016

Der Mensch bleibt im Mittelpunkt. Smarte Technologien für alle Branchen

Author(s): Georg Rehm
Published in: Vitako Aktuell. Zeitschrift der Bundes-Arbeitsgemeinschaft der Kommunalen IT-Dienstleister e.V., 2016

Language as a Data Type and Key Challenge for Big Data. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content

Author(s): Georg Rehm (ed.).
Published in: Strategic Research and Innovation Agenda for the Multilingual Digital Single Market, 2016

Technologies for Overcoming Language Barriers towards a truly integrated European Online Market

Author(s): Georg Rehm (ed.)
Published in: Strategic Agenda for the Multilingual Digital Single Market, 2015

The Strategic Impact of META-NET on the Regional, National and International Level.

Author(s): Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audronė Bielevičienė, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen García-Mateo, Josef van Genabith, Jan Hajič, Inma Hernáez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Joseph Mariani, John McNaught,
Published in: Language Resources and Evaluation, 2016, Page(s) 50(2):351-374

Language technologies for a multilingual Europe

Author(s): Georg Rehm, Felix Sasaki, Daniel Stein, Andreas Witt (Volume Editor)
Published in: Series: Translation and Multilingual Natural Language Processing, 2016