CORDIS - EU research results

Cross-Lingual Embeddings for Less-Represented Languages in European News Media

Project description

A new tool transforms monolingual websites into other languages

The internet is a global system of interconnected computer networks. However, it has developed mainly in English. In the European Union, where multilingualism is one of the core principles, websites and online services for citizens have developed national local-language resources and offer a second language (usually English) only when necessary. This is not enough. New tools are needed that enable high-quality transformations (not translations) between languages. In this context, the EU-funded EMBEDDIA project will develop the technology (cross-lingual embeddings coupled with deep neural networks) that will allow existing monolingual resources to be used across languages without the need for large computational resources.

Objective

Access to the internet is no longer a luxury. It is a basic component of everyday life and civic engagement, but one in which language remains a barrier to fair and equitable access. As Europe becomes more multicultural, and personal and professional mobility between cultures rapidly increases, access to fundamental resources such as local news and government services is limited by the great diversity of the EU's 24 official languages. The internet developed mostly in English, without clear planning for how language might form barriers to access and engagement, or for how multilingualism might be supported. In the EU, websites and online services for citizens have developed national local-language resources and often provide a second language (usually English) only when absolutely needed. The great proliferation of web content, multiple fast-changing content streams, and an expanding base of user interests make this approach untenable. While advanced natural language research and resources exist for a few dominant languages (English, French, German), many of Europe's smaller language communities, and the news media industry that serves them, lack appropriate tools for multilingual internet development. For the EU to realise a truly equitable, open, multilingual future internet, new tools allowing high-quality transformations (not translations) between languages are urgently needed. The EMBEDDIA project seeks to address these challenges by using cross-lingual embeddings coupled with deep neural networks, so that existing monolingual resources can be used across languages. The approach is intended to run fast enough for near real-time applications without the need for large computational resources. Across three years, the project's six academic and four industry partners will develop novel solutions, including for under-represented languages, and test them in real-world news and media production contexts.

Call for proposal

H2020-ICT-2018-20


Sub call

H2020-ICT-2018-2

Funding scheme

RIA - Research and Innovation action

Coordinator

INSTITUT JOZEF STEFAN
Net EU contribution
€ 560 059,54
Address
Jamova 39
1000 Ljubljana
Slovenia


Region
Slovenija Zahodna Slovenija Osrednjeslovenska
Activity type
Research Organisations
Total cost
€ 560 059,54

Participants (10)