Data Management Plan (version 1)

D5.4 will mainly contain the template to be used within ELG and by all providers to document the resources they supply to the ELG. It will cover all issues mentioned above (documentation, metadata description, legal aspects, e.g., licensing conditions and access, preservation, sustainability plan, etc.).

Market place report (setup and progress report)

D7.3 will report on the setup of, and progress towards establishing, the ELG as the primary LT market place in Europe. It will report on all activities carried out, events participated in, presentations given, first success stories, obstacles, etc.

User requirements and functional specifications

D2.1 (outcome of T2.1) will be a report consisting of a short description of the ELG target user groups, requirements for the prioritised ELG functionalities per user group, and functional specifications for the ELG platform services.

NCC report and LTB report

D7.2 will report on the setup of the NCC network of networks. It will list the NCC Leads and additional members as well as activities carried out with or by the NCCs. D7.2 will also include an analysis of the setup procedure and the engagement of the NCC Leads. Furthermore, D7.2 will report on the setup of the European LT Board.

Sustainability plan (progress report)

D7.8 will provide the intermediate progress regarding T7.5. It will systematically list all different options towards establishing a legal entity to guarantee the sustainability of the ELG platform with all their individual pros and cons.

Pilot calls setup

D6.1 will provide a blueprint for organizing the pilots, covering the calls, the panel and expert group (pilot board), management, the submission procedure, help for applicants, timelines, execution and evaluation. The content will be ready by the time the first call is issued, but it will also contain any changes made between the first and second calls. D6.1 will thus be essential for smooth and successful pilots demonstrating the use and usability of the ELG.

Requirements and architectural specification of the base infrastructure

D1.1 (to be assigned to the subcontractor) will present the outcome of T1.1. It will report on the software and hardware requirements of the base infrastructure and present the architectural specification of the infrastructure and its interaction with the ELG system (including a diagram with all ELG components). D1.1 will present and justify the selected solutions for (a) creating, managing and configuring a cloud platform (e.g., OpenStack plus anything else required); (b) deploying, scaling and executing containerized applications (e.g., Kubernetes); (c) any other software complementing the selected architecture (e.g., private Docker registries, load balancers, etc.); (d) software for monitoring the ELG components, i.e., ELG platform, metadata database, metadata index, LT services execution nodes (e.g., Kubernetes nodes). The justification must include benefits and drawbacks, results from experiments, comparison and contrast with alternatives, etc.

ELG Conference 2019 and LTB Meeting 2019

D7.5 reports on the ELG Conference 2019 and the co-located LTB Meeting 2019 (participants, stakeholders, results etc.).

Requirements and design guidelines

This report will consist of two parts: a short description of the GUI features, and user stories and design sketches with further development and design guidelines. D3.1 will also contain a description of the overall architecture for the GUI, including the suggested web technology stack. D3.1 is the outcome of T3.1 and T3.2.

Platform GUI (initial release)

Software code and report. The first release of the platform GUI will include the initial version of the CMS from T3.3 and a basic set of GUI features for the portal (T3.4). D3.2 will populate the public information channel with initial web content.

Metadata schema

D2.3 (result of T2.2) will consist of the metadata schema and a report. The schema, to be delivered in the form of XSD and RDF/OWL, will cater for the description of all resources and entities envisaged by ELG. It will build upon the META-SHARE schema (and its profiles) and the model of LT-World. Its specification will be based on D2.1, on the technical requirements for achieving interoperability and execution of services (T2.5) as well as on legal considerations for accessing resources (T5.4). The design of the schema will be modular (per resource type), flexible and aiming at standardisation (deploying controlled vocabularies, especially the widely accepted ones). The schema will include mappings to popular schemas and schemas used by the partners and initiatives that will share their resources with ELG.

Services, Tools and Components (first release)

D4.1 will consist of software code and a report. D4.1 will include the collected and prioritised set of existing services, tools, and components (result of T4.1). All tools, services, and components will be classified by type (e.g., MT, IE, ASR), languages addressed, licensing information, code/tool location, programming language/REST API details, etc. D4.1 will include a first set of tools and services integrated within the ELG (result of T4.2-4.5). At this point, the goal is to prove successful integration with the ELG platform and identify any issues with APIs, containerisation process, etc. and feed those back to WP2 for refinement in subsequent releases. Task leaders will be responsible for integration of at least five tools and services for each LT category (i.e., task).
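The classification described for D4.1 can be pictured as a simple catalogue record. The following is a minimal sketch only: the class name, field names, helper functions and sample entries are hypothetical illustrations and are not taken from the ELG metadata schema or from the deliverable itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToolRecord:
    """Hypothetical catalogue entry; the fields mirror the classification
    criteria listed for D4.1, but the names are assumptions."""
    name: str
    lt_type: str            # e.g. "MT", "IE", "ASR"
    languages: List[str]    # languages addressed
    licence: str            # licensing information
    location: str           # code/tool location
    implementation: str     # programming language or REST API details


def group_by_type(records: List[ToolRecord]) -> Dict[str, List[str]]:
    """Group tool names by their LT category."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        grouped[record.lt_type].append(record.name)
    return dict(grouped)


def categories_below_target(records: List[ToolRecord], target: int = 5) -> List[str]:
    """Return the categories that have fewer than `target` integrated tools."""
    grouped = group_by_type(records)
    return sorted(t for t, names in grouped.items() if len(names) < target)


# Made-up sample entries, for illustration only.
sample = [
    ToolRecord("example-mt", "MT", ["en", "de"], "Apache-2.0",
               "https://example.org/mt", "REST API"),
    ToolRecord("example-asr", "ASR", ["en"], "proprietary",
               "https://example.org/asr", "Python"),
    ToolRecord("example-mt-2", "MT", ["en", "lv"], "MIT",
               "https://example.org/mt2", "Java"),
]
```

With such records, a helper like `categories_below_target` could flag LT categories that have not yet reached the target of five integrated tools and services per task.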

Identification and collection of existing data sets, models, identified gaps and plans (version 1)

D5.1 will report on the identified resources and the associated identification methodology. The ELG platform will document all metadata elements needed. A summary of the results and a description of the methodology will be provided.

ELG platform (first release)

Software code and report. The first release of the platform will include the backend components required for the operation of the catalogue, i.e., the user management component (part of T2.8), components supporting documentation, uploading, storing and downloading of all resource types (tools and services, datasets, etc.), and APIs required for interacting with other layers (T2.3 and T2.4). An alpha release will be made internally available at M14 for testing with the upload of a limited set of tools and services (D4.1) and datasets (WP5). D2.4 will introduce the platform and include the first version of the guidelines on its use and on the provision of resources, as well as instructions for containerisation and for invoking remotely accessible web services.

Base Infrastructure (first release)

A first version of the base infrastructure (VMs, networking, storage) will be delivered to the ELG partners. All required support tools for developing the ELG software will be installed, configured and delivered; e.g., source control management and a CI server. The report will briefly present all development/support tools and the setup of the base infrastructure.

Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability

Author(s): Georg Rehm; Dimitris Galanis; Penny Labropoulou; Stelios Piperidis; Martin Welß; Ricardo Usbeck; Joachim Köhler; Miltos Deligiannis; Katerina Gkirtzou; Johannes Fischer; Christian Chiarcos; Nils Feldhus; Julian Moreno-Schneider; Florian Kintzel; Elena Montiel-Ponsoda; Víctor Rodriguez-Doncel; John Philip McCrae; David Laqua; Irina Patricia Theile; Christian Dittmar; Kalina Bontcheva; Ian Robert
Published in: Proceedings of the 1st International Workshop on Language Technology Platforms, 2020
DOI: 10.5281/zenodo.3842629

The University of Edinburgh’s Submissions to the WMT19 News Translation Task

Author(s): Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, Antonio Valerio Miceli Barone, Alexandra Birch
Published in: Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 2019, Page(s) 103-115
DOI: 10.18653/v1/w19-5304

Lattice-Based Unsupervised Test-Time Adaptation of Neural Network Acoustic Models

Author(s): Klejch, Ondrej; Fainberg, Joachim; Bell, Peter; Renals, Steve
Published in: Proceedings Interspeech 2019, 2019, Page(s) 1596-1600

Proceedings of the 1st International Workshop on Language Technology Platforms

Author(s): Georg Rehm, Kalina Bontcheva, Khalid Choukri, Jan Hajič, Stelios Piperidis, Andrejs Vasiļjevs (Editors)
Published in: Proceedings of the 1st International Workshop on Language Technology Platforms, 2020

European Language Grid: An Overview

Author(s): Rehm, G., Berger, M., Elsholz, E., Hegele, S., Kintzel, F., Marheinecke, K., Piperidis, S., Deligiannis, M., Galanis, D., Gkirtzou, K., Labropoulou, P., Bontcheva, K., Jones, D., Roberts, I., Hajic, J., Hamrlová, J., Kačena, L., Choukri, K., Arranz, V., Vasiļjevs, A., Anvari, O., Lagzdiņš, A., Meļņika, J., Backfried, G., Dikici, E., Janosik, M., Prinz, K., Prinz, C., Stampler, S., Thomas-An
Published in: Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020), 2020, Page(s) 3366-3380

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

Author(s): Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajic, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, Jose Manuel Gomez-Perez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, Ant
Published in: Proceedings of the 12th Language Resources and Evaluation Conference, 2020, Page(s) 3322-3332

Making Metadata Fit for Next Generation Language Technology Platforms: The Metadata Schema of the European Language Grid

Author(s): Labropoulou, P., Gkirtzou, K., Gavriilidou, M., Deligiannis, M., Galanis, D., Piperidis, S., Rehm, G., Berger, M., Mapelli, V., Rigault, M., Arranz, V., Choukri, K., Backfried, G., Perez, J. M. G., and Garcia-Silva, A.
Published in: Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020), 2020, Page(s) 3428-3437

Speaker Adaptive Training Using Model Agnostic Meta-Learning

Author(s): Ondrej Klejch, Joachim Fainberg, Peter Bell, Steve Renals
Published in: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019, Page(s) 881-888
DOI: 10.1109/asru46091.2019.9003751

Look, Read and Enrich - Learning from Scientific Figures and their Captions

Author(s): Jose Manuel Gomez-Perez, Raul Ortega
Published in: Proceedings of the 10th International Conference on Knowledge Capture, 2019, Page(s) 101-108
DOI: 10.1145/3360901.3364420

Learning Embeddings from Scientific Corpora using Lexical, Grammatical and Semantic Information

Author(s): Andrés García-Silva, Ronald Denaux, José Manuél Gómez-Pérez
Published in: K-CAP '19: Proceedings of the 10th International Conference on Knowledge Capture, 2019

An Empirical Study on Pre-trained Embeddings and Language Models for Bot Detection

Author(s): Andres Garcia-Silva, Cristian Berrio, José Manuel Gómez-Pérez
Published in: Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), 2019, Page(s) 148-155
DOI: 10.18653/v1/w19-4317

Character Mapping and Ad-hoc Adaptation: Edinburgh’s IWSLT 2020 Open Domain Translation System

Author(s): Pinzhen Chen, Nikolay Bogoychev, Ulrich Germann
Published in: Proceedings of the 17th International Conference on Spoken Language Translation, 2020, Page(s) 122-129
DOI: 10.18653/v1/2020.iwslt-1.14

Acoustic Model Adaptation from Raw Waveforms with Sincnet