Deliverables

Demonstrators, pilots, prototypes (4)
Initial COMPRISE SDK prototype: First prototype integrating the research results of WP2, WP3, and T4.1.
Final platform demonstrator and updated data protection and GDPR requirements: Final platform demonstrator populated with additional data and trained models. Updated version of D5.1 taking the latest research and legal advances into account.
Initial platform demonstrator: Fully functional platform demonstrator, implemented, populated with a few initial data and trained models, deployed in a working environment, and ready for use in other WPs.
Final COMPRISE SDK prototype and documentation: Final prototype integrating the research results of WP2, WP3, and T4.1, and Swagger online documentation.

Websites, patent filings, videos etc. (3)
Second dissemination and communication report: Update on the actions conducted, lessons learned on the innovation ecosystem proposed in COMPRISE, and planned dissemination and communication after the end of the project.
Dissemination and communication action plan: Summarises all planned dissemination actions and provides communication material to the partners (logo, graphical chart, public website, short presentation of the project, templates, poster, leaflet).
First dissemination and communication report: Update on the actions conducted and web document defining the COMPRISE knowledge repository on multilingual voice-enabled applications.

Other (10)
Final weakly supervised learning library: Final design, implementation, and evaluation of weakly supervised learning for all considered tasks.
Baseline speech and text transformation and model learning library: Design, implementation, and evaluation of baseline transformations focusing on deleting the user’s identity and words carrying critical information, and model learning.
Improved transformation library and initial privacy guarantees: Design, implementation, and evaluation of speech and text transformations addressing more types of private information and initial statistical utility/privacy bounds.
Final personalised learning library: Final design, implementation, and evaluation of model personalisation strategies for speech-to-text, spoken language understanding, and dialog management.
Initial multilingual interaction library: Software components and documentation for speech-to-speech translation and integration of dialog systems in the operating branch.
Final transformation library and privacy guarantees: Final design, implementation, and evaluation of speech and text transformations and final statistical utility/privacy bounds.
Data collection and curation features of the platform: Platform with data collection and curation features implemented and deployed in a working environment.
Final multilingual interaction library: Software components and documentation for speech-to-speech translation and integration of dialog systems in both the operating and the training branch.
Initial weakly supervised learning library: Design, implementation, and evaluation of weakly supervised learning for spoken language understanding.
Initial personalised learning library for speech-to-text: Design, implementation, and evaluation of initial model personalisation strategies for speech-to-text.

Open Research Data Pilot (2)
Initial data management plan
Final data management plan

Documents, reports (4)
Initial scientific evaluation: First combined evaluation of baseline speech, dialog, and translation tools.
Platform hardware and software architecture: Platform specification including main requirements, hardware and software architecture.
SDK software architecture: SDK specification including main requirements and software architecture.
Data protection and GDPR requirements: Guidelines, procedures, and recommendations on how to implement personal data protection in the platform.

Publications

Conference proceedings (21)

Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages
Author(s): Michael A. Hedderich; David Ifeoluwa Adelani; Dawei Zhu; Jesujoba O. Alabi; Udia Markus; Dietrich Klakow
Published in: 2020 Conference on Empirical Methods in Natural Language Processing, Issue 16/11/2020, 2020
Publisher: ACL
DOI: 10.18653/v1/2020.emnlp-main.204

Distant supervision and noisy label learning for low resource named entity recognition: A study on Hausa and Yorùbá
Author(s): Adelani, David Ifeoluwa; Hedderich, Michael; Zhu, Dawei; Van Den Berg, Esther; Klakow, Dietrich
Published in: AfricaNLP / PML4DC Workshop 2020 @ICLR 2020, Issue 26/04/2020, 2020
Publisher: OpenReview.net

Preventing author profiling through zero-shot multilingual back-translation
Author(s): Adelani, David; Zhang, Miaoran; Shen, Xiaoyu; Davody, Ali; Kleinbauer, Thomas; Klakow, Dietrich
Published in: 2021 Conference on Empirical Methods in Natural Language Processing, Issue 07/11/2021, 2021
Publisher: ACL

The effect of domain and diacritics in Yorùbá-English neural machine translation
Author(s): Adelani, David; Ruiter, Dana; Alabi, Jesujoba; Adebonojo, Damilola; Ayeni, Adesina; Adeyemi, Mofetoluwa; Awokoya, Ayodele; Espana-Bonet, Cristina
Published in: 18th Biennial Machine Translation Summit, Issue 16/08/2021, 2021
Publisher: AMTA

Benchmarking and challenges in security and privacy for voice biometrics
Author(s): Bonastre, Jean-Francois; Delgado, Hector; Evans, Nicholas; Kinnunen, Tomi; Lee, Kong Aik; Liu, Xuechen; Nautsch, Andreas; Noe, Paul-Gauthier; Patino, Jose; Sahidullah, Md; Srivastava, Brij Mohan Lal; Todisco, Massimiliano; Tomashenko, Natalia; Vincent, Emmanuel; Wang, Xin; Yamagishi, Junichi
Published in: 1st ISCA Symposium on Security and Privacy in Speech Communication, Issue 10/11/2021, 2021
Publisher: ISCA

Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?
Author(s): Srivastava, Brij Mohan Lal; Bellet, Aurélien; Tommasi, Marc; Vincent, Emmanuel
Published in: INTERSPEECH 2019, Issue 15/09/2019, 2019, Page(s) 3700-3704
Publisher: ISCA

Introducing the VoicePrivacy initiative
Author(s): Tomashenko, Natalia; Srivastava, Brij Mohan Lal; Wang, Xin; Vincent, Emmanuel; Nautsch, Andreas; Yamagishi, Junichi; Evans, Nicholas; Patino, Jose; Bonastre, J.-F; Noé, Paul-Gauthier; Todisco, Massimiliano
Published in: INTERSPEECH 2020, Issue 25/10/2020, 2020
Publisher: ISCA

Evaluating Voice Conversion-based Privacy Protection against Informed Attackers
Author(s): Srivastava, Brij Mohan Lal; Vauquier, Nathalie; Sahidullah, Md; Bellet, Aurélien; Tommasi, Marc; Vincent, Emmanuel
Published in: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, Issue 04/05/2020, 2020, Page(s) 2802-2806
Publisher: IEEE

Investigating the Impact of Pre-trained Word Embeddings on Memorization in Neural Networks
Author(s): Thomas, Aleena; Adelani, David; Davody, Ali; Mogadala, Aditya; Klakow, Dietrich
Published in: 23rd International Conference on Text, Speech and Dialogue, Issue 08/09/2020, 2020
Publisher: Springer

Design Choices for X-vector Based Speaker Anonymization
Author(s): Srivastava, Brij Mohan Lal; Tomashenko, Natalia; Wang, Xin; Vincent, Emmanuel; Yamagishi, Junichi; Maouche, Mohamed; Bellet, Aurélien; Tommasi, Marc
Published in: INTERSPEECH 2020, Issue 25/10/2020, 2020
Publisher: ISCA

The COMPRISE Cloud Platform
Author(s): Skadiņš, Raivis; Salimbajevs, Askars
Published in: 1st International Workshop on Language Technology Platforms, Issue 16/05/2020, 2020, Page(s) 108–111
Publisher: European Language Resources Association

Using Privacy-Transformed Speech in the Automatic Speech Recognition Acoustic Model Training
Author(s): Salimbajevs, Askars
Published in: 9th International Conference on Human Language Technologies – the Baltic Perspective, Issue 22/09/2020, 2020
Publisher: IOS Press

Assessing Unintended Memorization in Neural Discriminative Sequence Models
Author(s): Helali, Mossad; Kleinbauer, Thomas; Klakow, Dietrich
Published in: 23rd International Conference on Text, Speech and Dialogue, Issue 08/09/2020, 2020
Publisher: Springer

Private Protocols for U-Statistics in the Local Model and Beyond
Author(s): Bell, James; Bellet, Aurélien; Gascón, Adrià; Kulkarni, Tejas
Published in: International Conference on Artificial Intelligence and Statistics, Issue 27/08/2020, 2020, Page(s) 1573-1583
Publisher: Proceedings of Machine Learning Research

Data Augmentation for Pipeline-Based Speech Translation
Author(s): Alves, Diego; Salimbajevs, Askars; Pinnis, Mārcis
Published in: 9th International Conference on Human Language Technologies – the Baltic Perspective, Issue 22/09/2020, 2020
Publisher: IOS Press

A Comparative Study of Speech Anonymization Metrics
Author(s): Maouche, Mohamed; Srivastava, Brij Mohan Lal; Vauquier, Nathalie; Bellet, Aurélien; Tommasi, Marc; Vincent, Emmanuel
Published in: INTERSPEECH 2020, Issue 25/10/2020, 2020
Publisher: ISCA

On Semi-Supervised LF-MMI Training of Acoustic Models with Limited Data
Author(s): Sheikh, Imran; Vincent, Emmanuel; Illina, Irina
Published in: INTERSPEECH 2020, Issue 25/10/2020, 2020
Publisher: ISCA

Achieving Multi-Accent ASR via Unsupervised Acoustic Model Adaptation
Author(s): Turan, M. A. Tuğtekin; Vincent, Emmanuel; Jouvet, Denis
Published in: INTERSPEECH 2020, Issue 25/10/2020, 2020
Publisher: ISCA

Privacy Guarantees for De-identifying Text Transformations
Author(s): Adelani, David Ifeoluwa; Davody, Ali; Kleinbauer, Thomas; Klakow, Dietrich
Published in: INTERSPEECH 2020, Issue 25/10/2020, 2020
Publisher: ISCA

Who Started this Rumor? Quantifying the Natural Differential Privacy Guarantees of Gossip Protocols
Author(s): Bellet, Aurélien; Guerraoui, Rachid; Hendrikx, Hadrien
Published in: 34th International Symposium on Distributed Computing, Issue 12/10/2020, 2020
Publisher: LIPIcs

Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs
Author(s): Zantedeschi, Valentina; Bellet, Aurélien; Tommasi, Marc
Published in: 23rd International Conference on Artificial Intelligence and Statistics, Issue 26/08/2020, 2020
Publisher: PMLR

Peer reviewed articles (3)

MasakhaNER: Named Entity Recognition for African Languages
Author(s): David Ifeoluwa Adelani; Jade Abbott; Graham Neubig; Daniel D’souza; Julia Kreutzer; Constantine Lignos; Chester Palen-Michel; Happy Buzaaba; Shruti Rijhwani; Sebastian Ruder; Stephen Mayhew; Israel Abebe Azime; Shamsuddeen H. Muhammad; Chris Chinenye Emezue; Joyce Nakatumba-Nabende; Perez Ogayo; Aremu Anuoluwapo; Catherine Gitau; Derguene Mbaye; Jesujoba Alabi; Seid Muhie Yimam; Tajuddeen Rabiu
Published in: Transactions of the ACL, Issue 07/10/2021, 2021, ISSN 2307-387X
Publisher: MIT Press
DOI: 10.1162/tacl_a_00416

How can Private Information Recorded by Voice-Enabled Systems be Identified?
Author(s): Moretón Poch, Álvaro; Jaramillo, Ariadna
Published in: European Data Protection Law Review, Issue to appear, 2020, ISSN 2364-2831
Publisher: Lexxion

Monolingual and cross-lingual intent detection without training data in target languages
Author(s): Jurgita Kapočiūtė-Dzikienė; Askars Salimbajevs; Raivis Skadiņš
Published in: Electronics, Issue 11/06/2021, 2021, ISSN 2079-9292
Publisher: MDPI
DOI: 10.3390/electronics10121412