With respect to the privacy objective, we released two software tools that protect users’ voices and personal information: the COMPRISE Voice Transformer and the COMPRISE Text Transformer. The COMPRISE Voice Transformer aims to prevent biometric identification of users by converting their voice into that of another, randomly chosen person. It offers the same level of privacy protection among 50 speakers as original speech does among 20,000 speakers, as validated through state-of-the-art biometric protocols. The COMPRISE Text Transformer aims to identify potentially privacy-threatening words or phrases in a piece of text and to replace them with harmless alternatives while preserving the text’s structure. The main innovation lies in our word and phrase replacement strategy, which offers formal privacy guarantees.
With respect to the inclusiveness objective, the COMPRISE Speech-to-Text Translation tool can translate spoken language in a way that is robust to STT errors and disfluencies (e.g. hesitations, missing words). We also introduced a multilingual NLU system, which addresses the detection of user intent in any language without any training resources in that language, and an STT personalisation method, which improves STT performance by 27% relative for users with regional or foreign accents with only 1 h of untranscribed training data per accent.
With respect to the cost-effectiveness objective, COMPRISE Weakly Supervised STT reduces the amount of human-annotated data needed to train STT systems by more than 40%, while COMPRISE Weakly Supervised NLU benefits from as few as 100 labeled training examples and scales seamlessly down to a zero-shot setting, requiring no training at all. All these software tools build on cutting-edge deep learning and speech and language processing approaches, as well as on new approaches developed within COMPRISE.
Existing and new software tools have been integrated into an SDK interoperating with a Cloud Platform, which together provide a full-fledged open-source solution for voice technology companies and application developers. The COMPRISE SDK includes the COMPRISE Client Library, which can be deployed on any Android or iOS device and integrates all required voice functionalities; the COMPRISE App Wizard, which allows quick configuration of these functionalities; and the COMPRISE Personal Server, which runs computationally demanding services outside the device while still preserving privacy. The COMPRISE Cloud Platform provides services for data collection and curation and for system training.
We have also developed six demonstrators to showcase these innovative tools: Cookbook, Notes, Remote Presentation Control, Shoplay, Hospital Concierge, and Doctor’s Assistant. The integration of voice features in the Remote Presentation Control demonstrator took 2 person-months (PMs) with COMPRISE vs. 7 PMs without it, which translates into cost savings above 70%. These demonstrators were evaluated by potential end-users, who appreciated the new user experience offered by voice features and rated the demonstrators positively. This validates the benefits of COMPRISE, especially in the sectors of smart consumer apps, e-commerce, and e-health.
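The savings figure quoted for the Remote Presentation Control demonstrator can be reproduced with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes that cost is proportional to effort in person-months, which the text implies but does not state.

```python
def relative_savings(effort_with: float, effort_without: float) -> float:
    """Fraction of effort saved relative to the baseline (effort without the tool).

    Assumption: cost scales linearly with effort in person-months.
    """
    if effort_without <= 0:
        raise ValueError("baseline effort must be positive")
    return (effort_without - effort_with) / effort_without


# Figures reported for the Remote Presentation Control demonstrator:
# 2 PMs with COMPRISE vs. 7 PMs without it.
savings = relative_savings(effort_with=2, effort_without=7)
print(f"{savings:.1%}")  # 71.4%
```

Under that assumption, (7 - 2) / 7 is roughly 71%, consistent with the reported savings above 70%.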
All of these advances have been closely monitored through rigorous management tasks, and accompanied by a comprehensive summary and analysis of the main aspects of the General Data Protection Regulation (GDPR) that need to be considered for the implementation of the project and the development of COMPRISE, as well as by efficient dissemination, communication, and exploitation activities.