Stream Learning for Multilingual Knowledge Transfer

Deliverables

Initial progress report on continuous massive stream learning

Report describing the technologies developed so far as well as experimental results showing the quality of the respective methods

Initial Prototype Report

Initial report on the two use-case prototypes, UI wireframes, requirements, and user stories

Quality Assurance and Risk Assessment Plan

Report that incorporates details on the quality assurance processes adopted within SELMA

Use Case Description and Requirements

Complete description of the primary use cases of SELMA and personae, user stories, and requirements

Initial progress report on speech and natural language processing

Report describing the language processing technologies developed in the WP and experimental results, including first experiments on multilingual and user feedback transfer learning and distributed learning

Impact Plan

Report describing the consortium's intentions regarding dissemination, exploitation, and communication

Platform architecture and API documentation

Document describing the extended platform architecture for integrating the natural language processing components, collection and storage of editorial corrections, and integration of the continuous massive stream learning functionality

Evaluation Plan

Report describing the technical as well as the end-user testing scenarios to be carried out

Interim Periodic Progress Report

Report covering all administrative and financial details for the first 18 months

Initial Data Management Plan

Report explaining how data in the platform is managed, also addressing the issues of data protection and access rights

Publications

ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks

Author(s): Marcely Zanon Boito, John Ortega, Hugo Riguidel, Antoine Laurent, Loïc Barrault, Fethi Bougares, Firas Chaabani, Ha Nguyen, Florentin Barbier, Souhir Gahbiche, Yannick Estève
Published in: IWSLT 2022, 2022
Publisher: IWSLT 2022

Task Agnostic and Task Specific Self-Supervised Learning from Speech with LeBenchmark

Author(s): Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier
Published in: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, 2021, Page(s) 10ff
Publisher: NeurIPS

The Spoken Language Understanding Media Benchmark Dataset in the Era of Deep Learning: data updates, training and evaluation tools

Author(s): Gaëlle Laperrière, Valentin Pelloin, Antoine Caubrière, Salima Mdhaffar, Sahar Ghannay, Bassam Jabaian, Nathalie Camelin, Yannick Estève
Published in: 13th Language Resources and Evaluation Conference (LREC), 2022
Publisher: LREC

LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

Author(s): Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier
Published in: Proc. Interspeech 2021, 1439-1443, doi: 10.21437/Interspeech.2021-556, 2021
Publisher: Interspeech 2021

Modèles neuronaux pré-appris par auto-supervision sur des enregistrements de parole en français

Author(s): Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier
Published in: Journées d'Études sur la Parole - JEP2022, 2022
Publisher: JEP 2022

Speech Resources in the Tamasheq Language

Author(s): Marcely Zanon Boito, Fethi Bougares, Florentin Barbier, Souhir Gahbiche, Loïc Barrault, Mickael Rouvier, Yannick Estève
Published in: 13th Language Resources and Evaluation Conference (LREC), 2022
Publisher: LREC

Priberam Labs at the 3rd Shared Task on SlavNER

Author(s): Pedro Ferreira, Rúben Cardoso, Afonso Mendes
Published in: Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing, 2021, Page(s) 86-92
Publisher: 8th Workshop on Balto-Slavic Natural Language Processing

LeBenchmark, un référentiel d’évaluation pour le français oral

Author(s): Hang Le, Sina Alisamir, Marco Dinarelli, Fabien Ringeval, Solène Evain, Ha Nguyen, Marcely Zanon Boito, Salima Mdhaffar, Ziyi Tong, Natalia Tomashenko, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Didier Schwab, Laurent Besacier
Published in: Journées d'Études sur la Parole - JEP2022, 2022
Publisher: JEP 2022

Impact Analysis of the Use of Speech and Language Models Pretrained by Self-Supervision for Spoken Language Understanding

Author(s): Salima Mdhaffar, Valentin Pelloin, Antoine Caubrière, Gaëlle Laperrière, Sahar Ghannay, Bassam Jabaian, Nathalie Camelin, Yannick Estève
Published in: 13th Language Resources and Evaluation Conference (LREC), 2022
Publisher: LREC

Le benchmark MEDIA revisité : données, outils et évaluation dans un contexte d’apprentissage profond

Author(s): Gaëlle Laperrière, Valentin Pelloin, Antoine Caubrière, Salima Mdhaffar, Nathalie Camelin, Sahar Ghannay, Bassam Jabaian, Yannick Estève
Published in: Journées d'Études sur la Parole - JEP2022, 2022
Publisher: JEP

Where are we in semantic concept extraction for Spoken Language Understanding?

Author(s): Sahar Ghannay, Antoine Caubrière, Salima Mdhaffar, Gaëlle Laperrière, Bassam Jabaian, Yannick Estève
Published in: Speech and Computer: 23rd International Conference, SPECOM 2021, St. Petersburg, Russia, September 27–30, 2021, Proceedings, 2021
Publisher: SPECOM 2021