CORDIS - EU research results
Content archived on 2024-04-19

Creation, Reuse, Normalisation and Integration of Terminologies in Natural Language Processing Systems

Objective

TRANSTERM addresses the problems of enriching terminologies and integrating them into the application dictionaries of NLP systems. It also deals with the automatic and semi-automatic construction of application terminologies from corpora. The main objective is to facilitate the use of terminological data in NLP systems, thus tackling the critical issue of real-site customisation of this type of software. Two classes of users are foreseen: application developers and terminology builders/administrators.

There are three major lines of action:

The elaboration of a standardised generic representation of terminological data, enriched with linguistic information and with application-specific knowledge derived from terminological resources.
The implementation of a modular, portable toolbox allowing a) the assembly and customisation of terminological resources in order to characterise and enrich these resources, check their coherence and merge them with lexical data to create machine-processable lexico-terminological objects, and b) semi-automatic terminology extraction from text.
The validation of the tools, methods and formats developed within the project by means of three real-site tests involving corporate data and two smaller-scale experiments, together covering five languages (French, Italian, English, Greek and Portuguese).
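As an illustration of what merging terminological resources with lexical data and checking their coherence could look like, here is a minimal Python sketch. It is not the TRANSTERM representation or toolbox: the `TermEntry` record, its fields, the `merge_with_lexicon` function and the lexicon layout are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class TermEntry:
    """Hypothetical lexico-terminological record (not the TRANSTERM format)."""
    term: str
    language: str
    domain: str
    part_of_speech: Optional[str] = None
    variants: Set[str] = field(default_factory=set)


Key = Tuple[str, str]  # (lower-cased term, language code)


def merge_with_lexicon(
    terms: List[TermEntry], lexicon: Dict[Key, str]
) -> Tuple[Dict[Key, TermEntry], List[str]]:
    """Enrich term entries with lexical data and report coherence problems.

    `lexicon` is an assumed structure mapping (headword, language) to a
    part-of-speech tag.
    """
    merged: Dict[Key, TermEntry] = {}
    problems: List[str] = []
    for entry in terms:
        key = (entry.term.lower(), entry.language)
        lexical_pos = lexicon.get(key)
        # Work on a copy so the caller's entries are left untouched.
        current = replace(entry, variants=set(entry.variants))
        if current.part_of_speech is None:
            # Enrichment: take the part of speech from the lexicon.
            current.part_of_speech = lexical_pos
        elif lexical_pos is not None and lexical_pos != current.part_of_speech:
            # Coherence check: terminology and lexicon disagree.
            problems.append(
                f"{entry.term}: POS '{current.part_of_speech}' "
                f"conflicts with lexicon '{lexical_pos}'"
            )
        if key in merged:
            # Same term seen twice: pool the variants, flag domain clashes.
            if merged[key].domain != current.domain:
                problems.append(f"{entry.term}: conflicting domains")
            merged[key].variants |= current.variants
        else:
            merged[key] = current
    return merged, problems
```

Returning the problems as a list, rather than raising on the first conflict, matches a semi-automatic workflow in which a terminology administrator reviews the inconsistencies afterwards.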

Approach and Methodology

The project is based on methods and tools that already exist within the consortium or are under development. Results from related EC-sponsored projects and from the EUREKA projects GRAAL and GENELEX will be used. TRANSTERM is complementary to GRAAL and GENELEX, which deal with the generic grammatical and lexical components of NLP systems.

The TRANSTERM toolbox will also take into account established means of document description (such as SGML) in order to facilitate both the acquisition and reuse of terminological data. Existing international norms in the field of terminology will be taken into account, and links will be established with ongoing standardisation efforts in this field (such as LISA TIF) and in neighbouring areas (e.g. the Knowledge Interchange Format). The software will be developed on a UNIX platform, taking into account emerging standards such as OSF/Motif.

Exploitation and Future Prospects

The project is very much user-driven. The industrial consortium members expect to improve the productivity of their applications, especially in the area of automatic indexing. The software toolbox will allow the construction of application-specific disambiguation heuristics and of descriptions of how identified grammatical constructs are transformed into objects conforming to the characteristics of a terminology.
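To make the idea of turning identified grammatical constructs into terminology-like objects more concrete, the following Python sketch extracts candidate terms from part-of-speech-tagged text. It is only an assumed illustration, not a description of the TRANSTERM tools: the tag names (`ADJ`, `NOUN`), the pattern (a run of adjectives and nouns ending in a noun) and the function `extract_candidates` are hypothetical, and the tagging itself is presumed to come from an external tagger.

```python
from typing import List, Tuple


def extract_candidates(tagged: List[Tuple[str, str]]) -> List[str]:
    """Return multi-word candidate terms from (word, tag) pairs.

    A candidate is a run of ADJ/NOUN tokens that ends in a NOUN and
    contains at least two tokens (an assumed, simplified pattern).
    """
    candidates: List[str] = []
    run: List[Tuple[str, str]] = []

    def flush() -> None:
        # Drop trailing tokens until the run ends in a noun.
        while run and run[-1][1] != "NOUN":
            run.pop()
        if len(run) >= 2:
            # Normalise the construct to a lower-cased candidate term.
            candidates.append(" ".join(word.lower() for word, _ in run))
        run.clear()

    for word, tag in tagged:
        if tag in ("ADJ", "NOUN"):
            run.append((word, tag))
        else:
            flush()
    flush()  # handle a run that reaches the end of the text
    return candidates
```

In a semi-automatic setting, candidates produced this way would still be validated by a terminologist before entering a terminology.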

Semi-automatic construction of terminological resources in languages such as Greek and Portuguese will be supported by providing tools usable in these environments.

TRANSTERM is expected to lead to pre-industrial prototypes which lend themselves to rapid exploitation by industrial system developers, leading to marketable products. Associated services will become more cost-effective. The results of work on standardisation will be made available to the scientific and industrial communities.

The close cooperation of TRANSTERM with the related EUREKA projects GRAAL and GENELEX will have a synergetic effect on Community-sponsored efforts in Natural Language Processing.

Field of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.


Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposals

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding scheme

Funding scheme (or "Type of Action") within a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; the specific evaluation criteria to qualify for funding; and the use of simplified forms of costs such as lump sums.

Data not available

Coordinator

GSI-ERLI
EU contribution
No data
Address
1 place des Marseillais
94227 Charenton
France


Total cost

The total costs incurred by the organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data

Participants (8)
