Content archived on 2022-12-23

CENTRAL EUROPEAN GENELEX MODEL

Objective

This project aims to extend the generic electronic dictionary model (and accompanying SGML DTD) developed in the EUREKA GENELEX project to three Central European languages. The Consortium will work in close cooperation with the lexicon Working Group of the European Commission's EAGLES initiative, which is currently studying the GENELEX model. The Consortium's coordinator is also the host organisation of this EAGLES Working Group.

The goals of the joint research project are to:

- give Central European actors in the NLP field rapid access to a Western European Linguistic Engineering pre-standard model,
- extend the GENELEX model to new languages, evaluating its appropriateness and making it a stronger candidate for an internationally recognised standard,
- start a larger cooperation between the Partners, leading to industrial-level applications.

The work will begin by identifying theoretical issues in Czech, Hungarian and Polish that may require specific adaptations and extensions of the model. Two validation steps are foreseen for each language: first, building a representative core lexicon that conforms to the elaborated model; second, verifying that the core dictionary can be used in the context of an application. The work is organised along three parallel lines. Each line addresses one language and consists of three steps: (1) elaboration of a conceptual data model, (2) construction of a core dictionary, (3) validation through an application.

The main actor in each line of work is the local language specialist Partner. The other Partners act as observers and "consultants" on potential difficulties. This organisation allows partial results to be shared as early as possible. The deliverables of the project will be reports and SGML-encoded core dictionaries. Since all the Partners have working experience in several languages, care will be taken to account as far as possible for the specific features of Slavic and Finno-Ugric languages in general. The final report will therefore include a section on that general topic.

This project is viewed as a first step in a larger cooperation that will extend to:

- the construction of full-blown generic electronic dictionaries for all types of Natural Language Processing applications in the three languages,
- the joint production of industrial-level applications in these languages,
- work with additional, already identified Partners on other Central and Eastern European languages.

It is hoped, however, that even the initial results of the work will have a wider impact outside the Consortium. Efforts will be made to submit the resulting models to national standardisation bodies, with the aim of speeding up the maturing of Language Engineering in the three Central European states.

Topic(s)

Data not available

Call for proposal

Data not available

Funding scheme

CSC - Cost-sharing contracts

Coordinator

GSI - ERLI S.A
EU contribution
No data
Address
Place des Marseillais 1
94227 Charenton Le Pont
France

Total cost
No data

Participants (3)