CORDIS - EU research results
Content archived on 2024-05-07

BUILDING A MULTILINGUAL WORDNET DATABASE WITH SEMANTIC RELATIONS BETWEEN WORDS

Objective

EUROWORDNET will produce a multilingual database for use in a variety of applications, including machine-aided translation and quality information retrieval. The database will establish basic semantic relations between words for several European languages. The wordnets will then be linked to the American wordnet for English to derive a shared top-ontology. In providing easy access to words and related meanings, the resources so obtained will enable terms employed in user enquiries to be expanded to any set of closely related terms in a language, resulting in better retrieval of information in terms of recall.
http://www.let.uva.nl/ccl/EuroWordNet.html
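The query expansion described above can be illustrated with a minimal sketch. It is not the project's implementation: the synonym sets, the function names `expand_query` and `matches`, and the matching rule are all assumptions made for illustration.

```python
# Hypothetical toy data: each set groups words treated as closely related.
SYNSETS = [
    {"car", "automobile", "motorcar"},
    {"film", "movie", "picture"},
]


def expand_query(terms, synsets=SYNSETS):
    """Replace each query term by the sorted group of its related words."""
    expanded = []
    for term in terms:
        related = {term}
        for synset in synsets:
            if term in synset:
                related |= synset
        expanded.append(sorted(related))
    return expanded


def matches(document_words, expanded):
    """A document matches if it contains at least one word from every group."""
    words = set(document_words)
    return all(words & set(group) for group in expanded)


query = ["car", "rental"]
document = "cheap automobile rental".split()

# Without expansion the document is missed; with expansion it is retrieved.
unexpanded = [[term] for term in query]
print(matches(document, unexpanded))
print(matches(document, expand_query(query)))
```

Expansion of this kind raises recall because documents that use a related word instead of the literal query term are no longer missed.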
The database has been designed with a number of important characteristics:
- it is multi-lingual,
- it can handle language-specific information extracted from diverse resources,
- it provides a formal system usable in information retrieval applications as well as in the development of more complex knowledge bases of the future.
The database also has some innovative features, such as:
- an 'interlingual index', a pool of concepts (the superset of all language-specific concepts), where all shared language independent information is stored,
- a facility to label relations such as disjunction, conjunction, factivity, reversal and negation,
- encoding of explicit semantic relations across parts-of-speech,
- different interpretations of particular WordNet 1.5 relations,
- addition of new relations.
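One way to picture the interlingual index and labelled relations is the sketch below. It is only an assumption-laden illustration, not the EuroWordNet database design: the class names, the identifier `ili-0001`, the relation name `hypernym`, and the simplification of one synset per language per index record are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    name: str                      # illustrative relation name
    target_ili: str                # index record of the related concept
    labels: frozenset = frozenset()  # e.g. {"negation"}, {"disjunction"}


@dataclass
class Synset:
    language: str
    words: tuple
    ili_id: str                    # link into the shared interlingual index
    relations: list = field(default_factory=list)


class MultilingualWordnet:
    """Language-specific synsets linked through shared index records."""

    def __init__(self):
        self.by_ili = {}           # ili_id -> {language: Synset}

    def add(self, synset):
        self.by_ili.setdefault(synset.ili_id, {})[synset.language] = synset

    def translate(self, word, source, target):
        """Find target-language words sharing an index record with `word`."""
        results = []
        for per_language in self.by_ili.values():
            src = per_language.get(source)
            tgt = per_language.get(target)
            if src and tgt and word in src.words:
                results.extend(tgt.words)
        return results


wn = MultilingualWordnet()
wn.add(Synset("en", ("dog",), "ili-0001",
              [Relation("hypernym", "ili-0002")]))
wn.add(Synset("nl", ("hond",), "ili-0001"))
wn.add(Synset("es", ("perro",), "ili-0001"))
wn.add(Synset("it", ("cane",), "ili-0001"))

print(wn.translate("dog", "en", "es"))
```

Because every language links to the shared index rather than directly to every other language, adding a new wordnet requires only links to the index, not a pairwise mapping to each existing language.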
For each relation, language-specific test sentences are provided, with examples, to verify the relations between word pairs. Public guides for coding the semantic relations in each language will be published. The manuals provide a check-list and explain how the tests should be applied to derive the semantic relations between word-meanings. Finally, the database will have a specialised interface to cope with the complexity of a multi-lingual semantic database.
The coding of the first subset of the most fundamental meanings, the so-called 'base concepts', is in progress. Base concepts are used to define more specific concepts and their meanings in the languages involved. They have the most relations and occupy major positions in semantic hierarchies or taxonomies. A common set of base concepts has been defined by applying similar criteria across all the languages involved (English, Dutch, Italian and Spanish).
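The criterion that base concepts "have the most relations" can be sketched as a simple count. The selection rule below (take the top-ranked concepts per language and intersect them) is an assumption for illustration only; the source does not state the exact procedure, and the concept identifiers and relation triples are hypothetical.

```python
from collections import Counter


def rank_by_relation_count(relations):
    """Count how many relations each concept takes part in, highest first."""
    counts = Counter()
    for source, _name, target in relations:
        counts[source] += 1
        counts[target] += 1
    return counts.most_common()


def common_candidates(per_language_relations, top_n):
    """Concepts ranked in the top `top_n` of every language (assumed rule)."""
    tops = []
    for relations in per_language_relations.values():
        ranked = rank_by_relation_count(relations)[:top_n]
        tops.append({concept for concept, _count in ranked})
    return set.intersection(*tops)


# Hypothetical data: concepts are assumed to be already mapped to shared ids.
per_language = {
    "en": [("dog", "hypernym", "animal"),
           ("cat", "hypernym", "animal"),
           ("animal", "hypernym", "entity")],
    "nl": [("bird", "hypernym", "animal"),
           ("fish", "hypernym", "animal")],
}

print(common_candidates(per_language, top_n=1))
```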
The User Requirement and the Market
Because direct user involvement in the project is quite modest, a larger user group of interested companies and institutions, currently with 35 members, has been established. The purpose of this group is to create wider awareness of the use of this type of resource and to establish co-operation with other groups with a common interest. Each member of the user group receives key deliverables and data samples and is asked to provide feedback.
A number of different types of user have been identified:
- publishers, interested either in providing the initial resources or in the development of similar products (dictionaries, thesauri etc.),
- research institutes and R&D departments of companies working in the field of knowledge engineering or linguistic databases,
- organisations using or applying similar resources in the development of services or products which need multi-lingual semantic resources, such as WWW search engines,
- end-users interested in products helping them to manage their information resources.
The Way Ahead
The core of the database will be finished and extended with an official version of the merged top-ontology and the results will be verified.
The aim is to have produced a rich, high-quality coding of semantic relations and equivalence relations for a common set of about 5,000 base concepts in the four languages by mid-1997. By the end of the year this first subset will have been verified and made available for testing. The resources will be tested and demonstrated in an information retrieval system by Novell, one of the partners in the consortium.
Discussion fora and workshops have been arranged, where the project will lead discussion on the design, validation and standardisation of multi-lingual semantic databases.

Call for proposals

Data not available

Funding scheme

CSC - Cost-sharing contracts

Coordinator

University of Amsterdam
EU contribution
No data
Address
Spuistraat 134
1012 VB Amsterdam
Netherlands


Total cost

No data

Participants (3)
