CORDIS - EU research results

Enabling Multilingual Conversational AI

Project description

Training speech recognition algorithms to speak more languages

Say hello to Apple’s Siri, Amazon’s Echo and Google’s Assistant. But in which language? These task-based statistical dialogue systems (SDSs) are not available in all languages, which limits the global reach of conversational artificial intelligence (AI). The EU-funded MultiConvAI project will develop the first prototype system for scaling conversational AI to multiple languages. The system will build on a new methodology that learns multilingual word representations through a process called semantic specialisation. The project will develop Natural Language Understanding (NLU) modules for SDSs via more effective semantic specialisation based on joint multi-source, multi-target training. It will also focus on typologically diverse languages.

Objective

In the recent past, Conversational Artificial Intelligence (AI) has made major advances, thanks to the availability of big data and increasingly powerful deep learning. Task-based statistical dialogue systems (SDSs) are now viable, embedded in popular commercial applications (e.g. Apple’s Siri, Amazon’s Echo, Google’s Assistant) and cost-effective in many scenarios (e.g. customer support, call centre service, searching, booking). Yet current SDSs are only available for a handful of resource-rich languages, leaving the majority of the world’s languages and their speakers behind. Our project will develop the first prototype system for scaling conversational AI to multiple languages. This will be based on a new methodology that learns multilingual word representations (i.e. word embeddings, WEs) without the need for expensive training data, using a process called semantic specialisation that complements WEs with common-sense and linguistic knowledge from external knowledge graphs. Building on our promising pilot studies, we will develop Natural Language Understanding (NLU) modules for SDSs via 1) more effective semantic specialisation based on joint multi-source, multi-target training; and 2) a focus on typologically diverse languages. We foresee a pioneering use of selective sharing and structural adaptation for obtaining WEs, and of optimisation for the target languages guided by typological knowledge. The best resulting technology will be integrated in a demo prototype system which users and industries can deploy to generate multilingual NLU input for more widely portable SDSs. Since we also plan to explore the possibility of forming a start-up company, we will use the system to demonstrate the potential to our network of industry contacts and potential customers. On a larger scale, extending the multilingual scope of SDSs can have major socioeconomic benefits: it can broaden the global reach of conversational AI and enhance its commercial viability.
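
To make the idea of semantic specialisation more concrete, the sketch below shows one simple, well-known way of injecting knowledge-graph links into existing word embeddings: a retrofitting-style update that pulls each vector towards the vectors of its linked words while keeping it close to its original position. This is an illustrative stand-in, not the project's actual method. The toy vectors, the English-Italian synonym link, and the function names `retrofit` and `cosine` are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrofit(vectors, links, iterations=10, alpha=1.0):
    """Retrofitting-style specialisation (illustrative assumption).

    vectors: dict mapping word -> list of floats (the original embeddings)
    links:   dict mapping word -> list of linked words from a knowledge graph
    alpha:   weight that keeps each word near its original vector
    """
    original = {w: list(v) for w, v in vectors.items()}
    current = {w: list(v) for w, v in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in links.items():
            linked = [n for n in neighbours if n in current]
            if word not in current or not linked:
                continue  # nothing to pull this word towards
            beta = 1.0 / len(linked)  # neighbour weights sum to 1
            updated = []
            for d in range(len(current[word])):
                pull = sum(beta * current[n][d] for n in linked)
                updated.append((alpha * original[word][d] + pull) / (alpha + 1.0))
            current[word] = updated
    return current


# Toy data (hypothetical): two translation-equivalent words that start far apart.
vectors = {
    "cheap": [1.0, 0.0],
    "economico": [0.0, 1.0],
    "expensive": [0.9, 0.1],
}
# Hypothetical cross-lingual synonymy link taken from a knowledge graph.
links = {"cheap": ["economico"], "economico": ["cheap"]}

specialised = retrofit(vectors, links)
before = cosine(vectors["cheap"], vectors["economico"])
after = cosine(specialised["cheap"], specialised["economico"])
print(f"cheap ~ economico before: {before:.2f}, after: {after:.2f}")
```

In this toy setting the linked English and Italian words move closer together, while "expensive", which has no link in the graph, keeps its original vector. Joint multi-source, multi-target training and typology-guided sharing, as described in the objective, go beyond what this single-graph sketch covers.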

Host institution

THE CHANCELLOR MASTERS AND SCHOLARS OF THE UNIVERSITY OF CAMBRIDGE
Net EU contribution
€ 150 000,00
Address
TRINITY LANE THE OLD SCHOOLS
CB2 1TN Cambridge
United Kingdom

Region
East of England East Anglia Cambridgeshire CC
Activity type
Higher or Secondary Education Establishments
Total cost
No data

Beneficiaries (1)