CORDIS - EU research results

Cross-Lingual Embeddings for Less-Represented Languages in European News Media

Project description

New tool transforms monolingual websites into other languages

The internet is a global system of interconnected computer networks, yet it developed mostly in English. In the EU, where multilingualism is one of the founding principles, websites and online services for citizens have developed national local-language resources and provide a second language (usually English) only when needed. But this is not enough. New tools allowing high-quality transformations (not translations) between languages are needed. In this context, the EU-funded EMBEDDIA project will develop the technology (cross-lingual embeddings coupled with deep neural networks) to allow existing monolingual resources to be used across languages without the need for large computational resources.

Objective

Access to the internet is no longer a luxury: it is a basic component of everyday life and civic engagement, but one in which language continues to be a challenge for fair and equitable access. As Europe becomes more multicultural, and personal and professional mobility between cultures rapidly increases, access to fundamental resources such as local news and government services is limited by the great diversity of the EU's 24 official languages. The internet developed mostly in English, without clear planning for how language issues might form barriers to access and engagement, or how multilingualism might be supported. In the EU, websites and online services for citizens have developed national local-language resources, and often provide a second language (usually English) only when absolutely needed; but the great proliferation of web content, multiple and fast-changing content streams, and an expanding user interest base make this approach untenable. And while advanced natural language research and resources exist for a few dominant languages (English, French, German), many of Europe's smaller language communities, and the news media industry that serves them, lack appropriate tools for multilingual internet development. For the EU to realise a truly equitable, open, multilingual future internet, new tools allowing high-quality transformations (not translations) between languages are urgently needed. The EMBEDDIA project seeks to address these challenges by using cross-lingual embeddings coupled with deep neural networks to allow existing monolingual resources to be used across languages, exploiting their high speed of operation for near real-time applications, without the need for large computational resources. Over three years, the project's six academic and four industry partners will develop novel solutions, including for under-represented languages, and test them in real-world news and media production contexts.
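As a rough illustration of what "using monolingual resources across languages" can mean, the sketch below trains a tiny topic classifier on English words only and then applies it to Slovene words. It is not EMBEDDIA's method: the two-dimensional vectors are hand-made toy values, the nearest-centroid classifier stands in for the deep neural networks mentioned above, and the names `SHARED_SPACE`, `train` and `predict` are hypothetical. The only assumption carried over from the description is that words from different languages are placed in one shared embedding space.

```python
import math

# Toy, hand-made 2-D vectors standing in for a shared cross-lingual
# embedding space. Illustrative values only, not from any EMBEDDIA model.
SHARED_SPACE = {
    ("en", "football"): (0.90, 0.10),
    ("en", "goal"): (0.80, 0.20),
    ("en", "election"): (0.10, 0.90),
    ("en", "parliament"): (0.20, 0.80),
    ("sl", "nogomet"): (0.85, 0.15),  # Slovene for "football"
    ("sl", "volitve"): (0.15, 0.85),  # Slovene for "elections"
}


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return tuple(sum(v[i] for v in vectors) / count for i in range(len(vectors[0])))


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def train(labelled_words, lang):
    """Build one centroid per label from labelled words in a single language."""
    by_label = {}
    for word, label in labelled_words:
        by_label.setdefault(label, []).append(SHARED_SPACE[(lang, word)])
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(model, word, lang):
    """Return the label whose centroid is most similar to the word's vector."""
    vector = SHARED_SPACE[(lang, word)]
    return max(model, key=lambda label: cosine(model[label], vector))


# The labelled resource exists only in English ...
english_training = [
    ("football", "sport"),
    ("goal", "sport"),
    ("election", "politics"),
    ("parliament", "politics"),
]
model = train(english_training, "en")

# ... but the model can be queried with Slovene words, because both
# languages share the same vector space.
for slovene_word in ("nogomet", "volitve"):
    print(slovene_word, "->", predict(model, slovene_word, "sl"))
```

No Slovene training labels are needed in this sketch, and the lookup-plus-similarity step is cheap, which loosely mirrors the description's emphasis on speed and modest computational requirements.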

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Funding Scheme

A funding scheme (or “Type of Action”) is a set of actions inside a programme that share common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

RIA - Research and Innovation action

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

H2020-ICT-2018-20

Coordinator

INSTITUT JOZEF STEFAN
Net EU contribution

Net EU financial contribution: the amount the participant receives, minus the EU contribution passed on to its linked third parties. It reflects how the EU financial contribution is distributed between direct beneficiaries of the project and other types of participants, such as third-party participants.

€ 560 059,54
Address
Jamova 39
1000 Ljubljana
Slovenia

Region
Slovenija Zahodna Slovenija Osrednjeslovenska
Activity type
Research Organisations
Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

€ 560 059,54

Participants (10)
