The Community Research and Development Information Service - CORDIS
Information & Communication Technologies

Language Technologies

Project factsheets will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet. This is updated on a regular basis with public deliverables, etc.

X-Like - Cross-lingual Knowledge Extraction





At a glance

FP7-ICT-2011-7 - Language technologies

288342 - STREP


The goal of the X-LIKE (Cross-Lingual Knowledge Extraction) project is to develop innovative technology to monitor and aggregate knowledge from global, multilingual news sources (both mainstream and social media). By combining computational linguistics, machine learning, text mining and semantic technologies, the project will address two key open research problems:
- extracting and integrating formal knowledge from multilingual texts with cross-lingual knowledge bases;
- adapting linguistic techniques to deal with the irregularities of informal language used primarily in social media.
The developed technology will be portable to other languages, while the project will specifically address English, German, Spanish, Chinese, Catalan and Slovenian.

Objective and Innovation

The main goal of the X-LIKE project is to combine scientific insights from several research areas to advance text understanding. The project will advance the state of the art in several ways, in particular in making sense of massive volumes of informal online content in different languages. Combining multiple languages into a single knowledge representation will unlock a vast amount of data that can be leveraged for advanced analytics and complex event processing.

Target group of the project

Tools resulting from the project will enable cross-lingual services for publishers, media monitoring and business intelligence, and will be tested in two case studies within the project. The first use case focuses on fast and accurate delivery of financial and business news in English. The second addresses the general news domain in English and Slovenian.

The result

The key tangible result of the project will be an “X-LIKE Software Toolkit”, whose core functionality is the transformation of text from the target languages into a logic-based interlingua. This will in turn allow operations such as cross-lingual search, question answering, exploratory analysis, visualization, and identification of trends and complex events through time.
A client/server-based software package offering the complete project functionality, together with the appropriate supporting material (documentation etc.), will be publicly available on the project website.


X-LIKE’s research outcomes will enable people to better access online content and services across languages, and hence contribute to the vision of a single community and ultimately of an integrated market across Europe. X-LIKE technologies and methods will allow for the creation of new distribution channels and the reuse of content across linguistic barriers, making content production potentially cheaper, as the same digital resource can be reused and monetised in different markets. The project results will help to facilitate the exchange and flow of ideas across linguistic and geographic borders.


Contact Person:

Tel: +386 1 4773513








This page is maintained by: Susan Fraser (email removed)