The Community Research and Development Information Service - CORDIS
Information & Communication Technologies

Language Technologies



Project factsheets will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet. This is updated on a regular basis with public deliverables, etc.

PANACEA - Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies

24864 - STREP


At a glance

ICT-2007.2.2 - Cognitive Systems, Interaction, Robotics


A strategic challenge for Europe in today's globalised economy is to overcome language barriers through technological means. In particular, Machine Translation (MT) systems are expected to have a significant impact on the management of multilingualism in Europe, making it possible to translate the huge quantity of written and spoken data produced and thus covering the needs of hundreds of millions of citizens. PANACEA addresses the most critical aspect of Machine Translation: the so-called resources bottleneck. Language Technologies depend on the availability of language-dependent knowledge for real-life implementation, i.e. they require Language Resources (LRs). In addition, the LRs for a given language can never be considered complete or final, because natural language keeps changing and new knowledge domains and language varieties keep emerging.


This constant need for new LRs can only be met by an automatic, dynamic and adaptive system for compiling, producing and validating LRs, conceived as integrated machinery for LR production.


The objective of PANACEA is to build a factory of LRs that progressively automates the stages involved in the acquisition, production, updating and maintenance of the LRs required by MT systems and other Language Technology applications, within the time required. This automation will significantly cut cost, time and human effort.

Scientific Innovation

PANACEA will employ novel methods to automatically learn lexical and grammatical knowledge from large amounts of text, and present that knowledge in a form readily exploitable by automatic translation systems and other Language Technology applications. The modules performing these tasks will be integrated in a web-based virtual "factory", which forms a standardized, language-independent and scalable backbone into which the functionalities can be plugged.

The result

A virtual "factory" that can produce the desired language resources at the minimum expected quality and coverage, with little or no human intervention. The "factory" will be validated by producing a pre-defined set of language resources and evaluating their quality.


If successful, the concept of highly automated production of high-quality language resources will revolutionize many fields of language technology. In particular, it will improve the availability and quality of machine translation systems for all languages that have sufficient volumes and types of corpora available.

Where will the project be present?

The results and future plans of PANACEA will be regularly presented at major language technology conferences (LREC, ACL, COLING, STATMT, MT Summit). PANACEA will organise two targeted workshops, one on scientific issues and the other on technology transfer.


Contact Person:

Name: Núria Bel

Tel: +34 935 422 307

Fax: +34 935 422 321


Organisation: Universitat Pompeu Fabra, Spain


Universitat Pompeu Fabra, Spain
Consiglio Nazionale delle Ricerche, Italy
Athena Research and Innovation Centre, Greece
University of Cambridge, UK
Linguatech, Germany
Dublin City University, Ireland
Evaluations and Language Resources Distribution Agency, France



This page is maintained by: Susan Fraser (email removed)