CORDIS - EU research results
Content archived on 2024-04-19

Creation, Reuse, Normalisation and Integration of Terminologies in Natural Language Processing Systems

Objective

TRANSTERM addresses the problems of enriching terminologies and integrating them into the application dictionaries of natural language processing (NLP) systems. It also deals with the automatic and semi-automatic construction of application terminologies from corpora. The main objective is to facilitate the use of terminological data in NLP systems, thus tackling the critical issue of customising this type of software for real sites. Two classes of users are foreseen: application developers and terminology builders/administrators.

There are three major lines of action:

The elaboration of a standardised generic representation of terminological data enriched with linguistic information and with application-specific knowledge derived from terminological resources.
The implementation of a modular, portable toolbox allowing a) the assembly and customisation of terminological resources in order to characterise and enrich these resources, check their coherence and merge them with lexical data to create machine-processable lexico-terminological objects, and b) semi-automatic terminology extraction from text.
The validation of the tools, methods and formats developed within the project by means of three real-site tests involving corporate data and two smaller-scale experiments, covering five languages altogether (French, Italian, English, Greek and Portuguese).
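As a rough illustration of the first two lines of action, the following minimal sketch shows what a term record enriched with linguistic information, a merge with lexical data including a simple coherence check, and a naive extraction of term candidates from text could look like. All names here (`TermEntry`, `merge_entries`, `candidate_terms`) and the chosen fields are assumptions made for illustration; they are not the TRANSTERM formats or tools, which this page does not describe in detail.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    """Hypothetical term record: a term plus linguistic and domain information."""
    term: str
    language: str                 # e.g. "en", "fr"
    part_of_speech: str = ""      # linguistic enrichment, empty if unknown
    domain: str = ""              # application-specific knowledge
    variants: set = field(default_factory=set)


def merge_entries(terminology, lexicon):
    """Merge lexical data into terminology entries, keyed by (term, language).

    Terminology entries are updated in place. Returns the merged entries and
    a list of coherence problems (here: conflicting part-of-speech values).
    """
    merged = {(e.term.lower(), e.language): e for e in terminology}
    conflicts = []
    for lex in lexicon:
        key = (lex.term.lower(), lex.language)
        if key not in merged:
            merged[key] = lex
            continue
        entry = merged[key]
        if (entry.part_of_speech and lex.part_of_speech
                and entry.part_of_speech != lex.part_of_speech):
            conflicts.append(key)          # report, do not overwrite
        elif not entry.part_of_speech:
            entry.part_of_speech = lex.part_of_speech
        entry.variants |= lex.variants
    return merged, conflicts


def candidate_terms(text, min_count=2):
    """Naive term-candidate extraction: two-word sequences seen repeatedly.

    Sentence boundaries are ignored; a human would still validate the
    candidates, which is what makes the process semi-automatic.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())   # runs of Unicode letters
    bigrams = Counter(zip(words, words[1:]))
    return [" ".join(pair) for pair, n in bigrams.items() if n >= min_count]
```

A real toolbox would rely on linguistic analysis rather than raw word pairs; the sketch only shows how the three operations relate to one another.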

Approach and Methodology

The project is based on methods and tools that already exist within the consortium or are under development. Results from related EC-sponsored projects and from the EUREKA projects GRAAL and GENELEX will be used. TRANSTERM is complementary to GRAAL and GENELEX, which deal with the generic grammatical and lexical components of NLP systems.

The TRANSTERM toolbox will also take into account established document description formats (such as SGML) in order to facilitate both the acquisition and reuse of terminological data. Existing international norms in the field of terminology will be taken into account, and links will be established with ongoing standardisation efforts in this field (such as LISA TIF) and in neighbouring areas (e.g. the Knowledge Interchange Format). The software will be developed on a UNIX platform, taking into account emerging standards such as OSF/Motif.

Exploitation and Future Prospects

The project is strongly user-driven. The industrial consortium members expect to improve the productivity of their applications, especially in the area of automatic indexing. The software toolbox will allow the construction of application-specific disambiguation heuristics and of descriptions of how identified grammatical constructs are transformed into objects conforming to the characteristics of a terminology.

Semi-automatic construction of terminological resources in languages such as Greek and Portuguese will be supported by providing tools usable in these environments.

TRANSTERM is expected to lead to pre-industrial prototypes which lend themselves to rapid exploitation by industrial system developers, leading to marketable products. Associated services will become more cost-effective. The results of work on standardisation will be made available to the scientific and industrial communities.

The close cooperation of TRANSTERM with the related EUREKA projects GRAAL and GENELEX will have a synergistic effect on Community-sponsored efforts in Natural Language Processing.

Coordinator

GSI-ERLI
EU contribution
No data
Address
1 place des Marseillais
94227 Charenton
France

Total cost

No data

Participants (8)
