The Community Research and Development Information Service - CORDIS
Information & Communication Technologies

Language Technologies


Project factsheets will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet, which is updated on a regular basis with public deliverables, etc.

MultilingualWeb-LT (LT-Web) - Language Technology in the Web

287815 - CSA


At a glance

FP7-ICT-2011-7 - Language technologies

  • Duration: 24 months
  • Start date: 1 January 2012
  • End date: 31 December 2013
  • Project officer: Kimmo Rossi


One of the main obstacles to the effective integration of language technologies (such as machine translation) into mainstream web technologies is the lack of standardised "information about the content" (metadata). In other words, language technologies are applied "blindly", without clear background knowledge about the content being processed and without leaving a "trace" of the processing steps already performed on it. Because there is no commonly accepted, systematic approach to describing the content and the processing steps, a lot of valuable information is lost.

The objective of MultilingualWeb-LT is to demonstrate how such metadata can be encoded, maintained and exploited in various processes and systems (e.g. localisation workflows, machine translation, and tools and content management systems such as Okapi and Drupal). Rather than taking a theoretical, institutionalised approach to standardisation, MultilingualWeb-LT intends to set up reference implementations with real-life systems and real users, in which the value of metadata is concretely demonstrated. The resulting conventions and findings will be documented and endorsed as W3C standards, with the necessary documentation, data and test suites, as required by W3C standardisation policy.
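As a purely illustrative sketch of what "metadata about the content" can mean in practice, the snippet below reads the standard HTML `translate` attribute and splits a fragment into text that may be sent to machine translation and text that must be left untouched. The factsheet does not say which metadata items the project defines, so the choice of this attribute, the class name `TranslateFlagParser`, the function `split_for_mt` and the sample markup are all assumptions made for illustration, not the project's reference implementation.

```python
from html.parser import HTMLParser

# Void elements never receive an end tag, so they must not touch the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TranslateFlagParser(HTMLParser):
    """Collect text segments together with an inherited 'translate' flag.

    Simplifications (assumptions of this sketch): the markup is well formed,
    and only the explicit values translate="yes" and translate="no" are
    recognised; anything else inherits the flag of the parent element.
    """

    def __init__(self):
        super().__init__()
        self.stack = [True]   # content is translatable by default
        self.segments = []    # list of (text, translatable) pairs

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        value = dict(attrs).get("translate")
        if value == "no":
            flag = False
        elif value == "yes":
            flag = True
        else:
            flag = self.stack[-1]  # inherit from the enclosing element
        self.stack.append(flag)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.segments.append((text, self.stack[-1]))


def split_for_mt(html):
    """Return (text, translatable) pairs for an HTML fragment."""
    parser = TranslateFlagParser()
    parser.feed(html)
    parser.close()
    return parser.segments


# Hypothetical sample content: the project name is marked as not translatable.
sample = ('<p>Welcome to <span translate="no">MultilingualWeb-LT</span>, '
          'a project on language technology.</p>')

if __name__ == "__main__":
    for text, translatable in split_for_mt(sample):
        print("translate" if translatable else "keep", "|", text)
```

A translation tool that ignores such a flag would process the protected name "blindly". One that honours it can skip the name and record that it did so, which is the kind of trace the text above says is missing today.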

Objective and Innovation

This is the first broad and large-scale attempt to use practical reference implementations as a vehicle to establish best practices and standards for linguistic content processing. It builds on the partners' previous standardisation work and on the everyday core business scenarios of the industrial partners, who represent language service providers, web technologies and the software/IT industry.

Target group of the project

The primary target groups are companies and administrations that produce, process (e.g. translate) or publish online textual content on a large scale. Examples: language service providers (LSPs), publishers, webmasters, translation agencies, language technology providers, web technology providers (especially content management system providers).

The result

The concrete results are as follows:
- open workshops illustrating and disseminating the best practices;
- documented W3C standards on linguistic metadata;
- tools and modules that add MultilingualWeb-LT metadata support to open-source and commercial applications.


These results are expected to lead to better integration of language technologies into web technologies, content management systems and localisation workflows, and in turn to better translation quality, better language coverage (especially for less widely spoken languages) and significant cost savings.


Contact Person:
Name: Dr Felix Sasaki








This page is maintained by: Susan Fraser (email removed)