Creation, Reuse, Normalisation and Integration of Terminologies in Natural Language Processing Systems

Objective

TRANSTERM addresses the problems of enriching terminologies and integrating them into the application dictionaries of NLP systems. It also deals with the automatic and semi-automatic construction of application terminologies from corpora. The main objective is to facilitate the use of terminological data in NLP systems, thus tackling the critical issue of customising this type of software at real sites. Two classes of users are foreseen: application developers and terminology builders/administrators.

There are three major lines of action:

The elaboration of a standardised generic representation of terminological data, enriched with linguistic information and with application-specific knowledge derived from terminological resources.
The implementation of a modular, portable toolbox allowing a) the assembly and customisation of terminological resources in order to characterise and enrich these resources, check their coherence and merge them with lexical data to create machine-processable lexico-terminological objects, and b) semi-automatic terminology extraction from text.
The validation of the tools, methods and formats developed within the project by means of three real-site tests involving corporate data and two smaller-scale experiments, together covering five languages (French, Italian, English, Greek and Portuguese).

Approach and Methodology

The project builds on methods and tools that already exist within the consortium or are under development. Results from related EC-sponsored projects and from the EUREKA projects GRAAL and GENELEX will be used. TRANSTERM is complementary to GRAAL and GENELEX, which deal with the generic grammatical and lexical components of NLP systems.

The TRANSTERM toolbox will also take into account established document description formats (such as SGML) in order to facilitate both the acquisition and the reuse of terminological data. Existing international norms in the field of terminology will be taken into account, and links will be established with ongoing standardisation efforts in this field (like LISA TIF) and in neighbouring areas (e.g. the Knowledge Interchange Format). The software will be developed on a UNIX platform, taking into account emerging standards such as OSF/Motif.

Exploitation and Future Prospects

The project is strongly user-driven. The industrial consortium members expect to improve the productivity of their applications, especially in the area of automatic indexing. The software toolbox will allow the construction of application-specific disambiguation heuristics and of descriptions of how identified grammatical constructs are transformed into objects conforming to the characteristics of a terminology.

Semi-automatic construction of terminological resources in languages such as Greek and Portuguese will be supported by providing tools usable in these environments.

TRANSTERM is expected to lead to pre-industrial prototypes that lend themselves to rapid exploitation by industrial system developers, leading to marketable products. Associated services will become more cost-effective. The results of work on standardisation will be made available to the scientific and industrial communities.

The close cooperation of TRANSTERM with the related EUREKA projects GRAAL and GENELEX will have a synergetic effect on Community-sponsored efforts in Natural Language Processing.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

FP3-LRE - Specific programme of research and technological development (EEC) in the field of telematic systems in areas of general interest - Linguistic research and engineering -, 1990-1994

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

Data not available

Coordinator

GSI-ERLI

EU contribution

No data

Address

1 place des Marseillais
94227 Charenton
France

Total cost

No data

Participants (8)

Aérospatiale Société Nationale Industrielle SA

France

EU contribution

No data

Address

Total cost

No data

CENTRO RICERCHE FIAT S.C.P.A.

Italy

EU contribution

No data

Address

Strada Torino 50
10043 ORBASSANO

Total cost

No data

ILTEC

Portugal

EU contribution

No data

Address

Total cost

No data

INSTITUTE FOR LANGUAGE AND SPEECH PROCESSING

Greece

EU contribution

No data

Address

Total cost

No data

ISSCO

Switzerland

EU contribution

No data

Address

Total cost

No data

ISTITUTO TRENTINO DI CULTURA

Italy

EU contribution

No data

Address

VIA SANTA CROCE 77
38100 TRENTO

Total cost

No data

LINGSOFT

Finland

EU contribution

No data

Address

Total cost

No data

Électricité de France (EDF)

France

EU contribution

No data

Address

Total cost

No data
