
CRACKER
Project ID: 645357
Funded under:
Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research
Project details
Total cost:
EUR 999 995
EU contribution:
EUR 999 995
Coordinated in:
Germany
Call for proposals:
H2020-ICT-2014-1
Funding scheme:
CSA - Coordination and support action
Objective
The European machine translation (MT) research community is experiencing increased pressure for rapid success – from the legal and political frameworks and schedules of the EU, such as the Digital Single Market, but also from the globalising business world. At the same time, the research community has to cope with a striking disproportion between the scope of the challenges and the available resources, especially for translation to and from languages that have only fragmentary or no technological support at all.
CRACKER aims to make MT research more efficient and effective by following the successful example of other disciplines, where massively collaborative research on shared resources – guided by interoperability, standardisation, agreed major challenges and comprehensive success metrics – has led to breakthroughs that would have been impossible otherwise. The nucleus of this new research, development and innovation strategy towards high-quality MT is the group of projects funded through H2020-ICT-17a/b (partly extending to relevant FP7 actions such as QTLeap, LIDER and MLi), which CRACKER (ICT-17c) will support in coordination, evaluation and resources.
In order to achieve its challenging goals efficiently, CRACKER will build upon, consolidate and extend initiatives for collaborative MT research supported by earlier EU-funded actions. These include evaluation campaigns such as the Workshop on Statistical Machine Translation (WMT) and the International Workshop on Spoken Language Translation (IWSLT), the META-SHARE open infrastructure for sharing language resources and technologies with extensions for MT assembled by QTLaunchPad, and open-source tool building and training (MT Marathons). Coordination, communication and outreach to user communities will build upon existing networks and communication infrastructures such as the META-FORUM event series and strong involvement of industrial associations such as GALA and TAUS.
Deliverables
-
Report on IWSLT 2015
Analysis of the 2015 shared task results, together with the training and test materials which will be uploaded to META-SHARE.
-
New version of META-SHARE software
This deliverable will provide an extended and improved version of the META-SHARE platform with a streamlined and adapted data model as well as an adapted licensing toolkit.
-
Kick-off meeting of the ICT-17 group of funded projects
Organisation of a kick-off meeting of the whole ICT-17 pillar of funded projects, i.e., the big research project to be funded under ICT-17a and the innovation pilots to be funded under ICT-17b.
-
Website for the QT initiative
The web portal for the QT initiative (the group of ICT-17 projects and other projects) will be the public face of the initiative, which CRACKER will help to coordinate and around which it will help to build a community.
-
Project infrastructure (final version)
Completed version of project infrastructure (especially the website), which is specifically geared towards communication and dissemination purposes.
-
Project infrastructure (initial version)
Initial version of project infrastructure for the coordination of the project, including email lists, project website, intranet, etc.
-
Report on QT Marathon 2015
This report will provide data about the QT Marathon 2015 (number of participants, overview of talks etc.) and a summary of the research projects. In addition, final presentations of the project results will be uploaded to the web.
-
Report on META-FORUM 2015
Organisation of the conference META-FORUM 2015 with the help of a subcontractor; the currently foreseen organiser is the LSP Tilde, and the location is Riga, Latvia.
-
Report on META-FORUM 2016
Organisation of the conference META-FORUM 2016 with the help of a subcontractor; the currently foreseen organiser is the University of Lisbon, and the location is Lisbon, Portugal.
-
Survey on the state of HQMT in industry and LSPs
This report summarises the results of the survey on the economic impact and uptake of recent EC-funded MT actions, especially with regard to industry and language service providers (LSPs). TAUS and GALA are foreseen as subcontractors.
-
Coordination with and support of MLi
This report will provide a summary of the support of and collaboration with MLi on resource infrastructures in relation to their deployment for building and offering multilingual digital services.
-
Data Management Plan
The CRACKER Data Management Plan will provide the data management policy of the project with regard to the produced data sets, containing, among others, information on standards and metadata used as well as on sharing, archiving and preservation.
-
Data Management Plan (Update)
The CRACKER Data Management Plan will provide the data management policy of the project with regard to the produced data sets, containing, among others, information on standards and metadata used as well as on sharing, archiving and preservation.
-
Coordination with and support of LIDER
This deliverable will report on the support and coordination activities with LIDER in rendering the META-SHARE data model in RDF following the recommendations of the W3C.
-
Strategic Research and Innovation Agenda for the LT/MT field
A joint deliverable with LT_Observatory.
-
Position Paper and preliminary joint Strategic Research and Innovation Agenda for the LT/MT field
Position Paper prepared jointly and endorsed by CRACKER and LT_Observatory, underpinned by a preliminary version of a joint Strategic Research and Innovation Agenda for the LT/MT field (SRIA). The form and content of these documents will be agreed between the CRACKER and LT_Observatory projects.
Publications
-
One Ontology to Bind Them All: The META-SHARE OWL Ontology for the Interoperability of Linguistic Datasets on the Web
Author(s): McCrae, John P., P. Labropoulou, J. Gracia, M. Villegas, V. Rodriguez-Doncel & P. Cimiano
Published in: The Semantic Web: ESWC 2015 Satellite Events, 2015.
-
Fostering the Next Generation of European Language Technology: Recent Developments – Emerging Initiatives – Challenges and Opportunities
Author(s): Georg Rehm; Jan Hajic; Josef van Genabith; Andrejs Vasiljevs
Published in: Proceedings of the Tenth International Conference on Language Resources and Evaluation, 2016.
-
Cracking The Language Barrier For A Multilingual Europe
Author(s): Georg Rehm
Published in: Language Use In Public Administration. Contributions To The Annual Conference 2015 Of EFNIL In Helsinki, 2016.
-
The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources
Author(s): Georg Rehm
Published in: Proceedings of International Conference on Language Resources and Evaluation (LREC 2016), 2015.
-
Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities
Author(s): Fernández-Barrera, M., V. Popescu, A. Toral, F. Gaspari and K. Choukri
Published in: Proceedings of the 10th Language Resources and Evaluation Conference (LREC), 2016. Page(s) 4550-4556.
-
Investigating Continuous Space Language Models for Machine Translation Quality Estimation
Author(s): Kashif Shah, Raymond W.M. Ng, Fethi Bougares, and Lucia Specia
Published in: Proceedings of Conference on Empirical Methods in Natural Language Processing, 2015.
-
SHEF-Multimodal: Grounding Machine Translation on Images
Author(s): Kashif Shah, Josiah Wang, and Lucia Specia
Published in: Proceedings of First Conference on Machine Translation (WMT16), Volume 2: Shared Task Papers, 2016.
-
Neural versus Phrase-Based Machine Translation Quality: a Case Study
Author(s): L. Bentivogli, A. Bisazza, M. Cettolo, M. Federico
Published in: Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016.
-
Large-scale Multitask Learning for Machine Translation Quality Estimation
Author(s): Kashif Shah and Lucia Specia
Published in: Proceedings of Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016.
-
Technology Landscape for Quality Evaluation: Combining the Needs of Research and Industry
Author(s): Kim Harris; Aljoscha Burchardt; Georg Rehm; Lucia Specia
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016. Page(s) 50-54.
-
Word embeddings and discourse information for Quality Estimation
Author(s): Carolina Scarton, Daniel Beck, Kashif Shah, Karin Sim Smith, and Lucia Specia
Published in: Proceedings of First Conference on Machine Translation (WMT16), Volume 2: Shared Task Papers, 2016.
-
Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem
Author(s): Georg Rehm; Aljoscha Burchardt; Ondrej Bojar; Christian Dugast; Marcello Federico; Josef van Genabith; Barry Haddow; Jan Hajic; Kim Harris; Philipp Koehn; Matteo Negri; Martin Popel; Lucia Specia; Marco Turchi; Hans Uszkoreit (eds.)
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016.
-
Findings of the 2016 Conference on Machine Translation
Author(s): Bojar, Ondrej and Chatterjee, Rajen and Federmann, Christian and Graham, Yvette and Haddow, Barry and Huck, Matthias and Jimeno Yepes, Antonio and Koehn, Philipp and Logacheva, Varvara and Monz, Christof and Negri, Matteo and Neveol, Aurelie and Neves, Mariana and Popel, Martin and Post, Matt and Rubino, Raphael and Scarton, Carolina and Specia, Lucia and Turchi, Marco and Verspoor, Karin and Zampieri, Marcos
Published in: Proceedings of the First Conference on Machine Translation, 2016.
-
Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation
Author(s): Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo
Published in: Proceedings of the EMNLP Second Workshop on Discourse in Machine Translation (DiscoMT), 2015. Page(s) 1-16.
-
Digital Representation of Rights for Language Resources
Author(s): Rodriguez-Doncel, V. and P. Labropoulou
Published in: Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications, ACL-IJCNLP, 2015.
-
SHEF-LIUM-NN: Sentence level Quality Estimation with Neural Network Features
Author(s): Kashif Shah, Fethi Bougares, Loic Barrault, and Lucia Specia
Published in: Proceedings of First Conference on Machine Translation (WMT16), Volume 2: Shared Task Papers, 2016.
-
SHEF-NN: Translation Quality Estimation with Neural Networks
Author(s): Kashif Shah, Varvara Logacheva, Gustavo Paetzold, Frédéric Blain, Daniel Beck, Fethi Bougares, and Lucia Specia
Published in: Proceedings of Tenth Workshop on Statistical Machine Translation (WMT15), 2015.
-
The IWSLT 2015 Evaluation Campaign
Author(s): M. Cettolo, J. Niehues, S. Stüker, L. Bentivogli, R. Cattoni, M. Federico
Published in: Proceedings of the 12th Workshop on Spoken Language Translation, 2015.
-
Findings of the 2015 Workshop on Statistical Machine Translation
Author(s): Ondrej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, and Marco Turchi
Published in: Proceedings of Tenth Workshop on Statistical Machine Translation (WMT15), 2015. Page(s) 1-46.
-
Towards a Systematic and Human-Informed Paradigm for High-Quality Machine Translation
Author(s): Aljoscha Burchardt; Kim Harris; Georg Rehm; Hans Uszkoreit
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016.
-
Ten Years of WMT Evaluation Campaigns: Lessons Learnt
Author(s): Ondřej Bojar, Christian Federmann, Barry Haddow, Philipp Koehn, Matt Post and Lucia Specia
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016.
-
The IWSLT Evaluation Campaign: Challenges, Achievements, Future Directions
Author(s): L. Bentivogli, M. Federico, S. Stüker, M. Cettolo, J. Niehues
Published in: Proceedings of the LREC 2016 Workshop “Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016.
-
Findings of the 2016 Conference on Machine Translation
Author(s): Ondrej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurelie Neveol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, and Marcos Zampieri
Published in: Proceedings of First Conference on Machine Translation (WMT16), Volume 2: Shared Task Papers, 2016.
-
CRACKER - Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research
Author(s): Georg Rehm
Published in: Proceedings of the 19th Annual Conference of the European Association for Machine Translation (EAMT 2016), 2015.
-
Technologies for Overcoming Language Barriers towards a truly integrated European Online Market
Author(s): Georg Rehm (ed.)
Published in: Strategic Agenda for the Multilingual Digital Single Market, 2015.
-
Der Mensch bleibt im Mittelpunkt. Smarte Technologien für alle Branchen
Author(s): Georg Rehm
Published in: Vitako Aktuell. Zeitschrift der Bundes-Arbeitsgemeinschaft der Kommunalen IT-Dienstleister e.V., 2016.
-
Language as a Data Type and Key Challenge for Big Data. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content
Author(s): Georg Rehm (ed.)
Published in: Strategic Research and Innovation Agenda for the Multilingual Digital Single Market, 2016.
-
Language technologies for a multilingual Europe
Author(s): Georg Rehm, Felix Sasaki, Daniel Stein, Andreas Witt (Volume Editor)
Published in: Series: Translation and Multilingual Natural Language Processing, 2016.
Coordinator
EU contribution: EUR 555 688.75
TRIPPSTADTER STRASSE 122
67663 KAISERSLAUTERN
Germany
Participants
EU contribution: EUR 85 000
OVOCNY TRH 5/3
11636 PRAHA 1
Czech Republic
EU contribution: EUR 60 000
9 RUE DES CORDELIERES
75013 PARIS
France
EU contribution: EUR 74 875
VIA SANTA CROCE 77
38122 TRENTO
Italy
EU contribution: EUR 60 000
ARTEMIDOS 6 KAI EPIDAVROU
151 25 MAROUSSI
Greece
EU contribution: EUR 89 431.25
OLD COLLEGE, SOUTH BRIDGE
EH8 9YL EDINBURGH
United Kingdom
EU contribution: EUR 75 000
FIRTH COURT WESTERN BANK
S10 2TN SHEFFIELD
United Kingdom
Last updated on: 2017-09-11
Record number: 194311