PANACEA - Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies
24864 - STREP
ICT-2007.2.2 - Cognitive Systems, Interaction, Robotics
A strategic challenge for Europe in today's globalised economy is to overcome language barriers through technological means. In particular, Machine Translation systems are expected to have a significant impact on the management of multilingualism in Europe, making it possible to translate the huge quantity of (written or oral) data produced and thus covering the needs of hundreds of millions of citizens. PANACEA addresses the most critical aspect of Machine Translation: the so-called resources bottleneck. Language Technologies depend on the availability of language-dependent knowledge for their real-life implementation, i.e. they require Language Resources (LRs). In addition, LRs for a given language can never be considered complete or final because of the characteristics of natural language: language change and the emergence of new knowledge domains and new language varieties.
Challenge
This constant need for a supply of LRs can only be satisfied by an automatic, dynamic and adaptive system for compiling, producing and validating LRs: a system conceived as integrated machinery for LR production.
Goal
The objective of PANACEA is to build a factory of LRs that progressively automates the stages involved in the acquisition, production, updating and maintenance of the LRs required by Machine Translation (MT) systems and other Language Technology applications, within the time required. This automation will significantly cut the cost, time and human effort involved.
Scientific Innovation
PANACEA will employ novel methods to automatically learn lexical and grammatical knowledge from large amounts of texts, and present that knowledge in a form readily exploitable by automatic translation systems and other Language Technology applications. The modules performing these tasks will be integrated in a web-based virtual "factory" which forms a standardized and language-independent scalable backbone for the functionalities to be plugged in.
The result
A virtual "factory" that can produce the desired language resources at the minimum expected quality and coverage, with little or no human intervention. The "factory" will be validated by producing a pre-defined set of language resources and evaluating their quality.
Impact
If successful, the concept of highly automated production of high-quality language resources will revolutionise many fields of language technology. In particular, it will improve the availability and quality of machine translation systems for all languages that have sufficient volumes and types of corpora available.
Where will the project be present?
The results and future plans of PANACEA will be presented regularly at major language technology conferences (LREC, ACL, COLING, STATMT, MT Summit). PANACEA will organise two targeted workshops, one on scientific issues and the other on technology transfer.
Co-ordinator
Contact Person: Núria Bel
Tel: +34 935 422 307
Fax: +34 935 422 321
E-mail: nuria.bel@upf.edu
Organisation: Universitat Pompeu Fabra, Spain
Participants
Universitat Pompeu Fabra, Spain
This page is maintained by: Susan Fraser
