BUILDING A MULTILINGUAL WORDNET DATABASE WITH SEMANTIC RELATIONS BETWEEN WORDS

Project Information

EUROWORDNET

Grant agreement ID: LE24003

Project closed

Start date 1 March 1996

End date 28 February 1999

Funded under

Specific programme of research and technological development and demonstration in the area of telematic applications of common interest, 1994-1998

Total cost

€ 1 350 070,00

EU contribution

€ 900 000,00


Coordinated by

University of Amsterdam
Netherlands

Objective

EUROWORDNET will produce a multilingual database for use in a variety of applications, including machine-aided translation and quality information retrieval. The database will establish basic semantic relations between words for several European languages. The wordnets will then be linked to the American wordnet for English to derive a shared top-ontology. By providing easy access to words and their related meanings, the resulting resources will allow the terms used in a user enquiry to be expanded to any set of closely related terms in a language, improving the recall of information retrieval.
http://www.let.uva.nl/ccl/EuroWordNet.html
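The query expansion described in the objective can be illustrated with a minimal sketch. It assumes a toy lookup table of related words; the table contents and the function names expand_query and search are hypothetical and are not part of the EuroWordNet database or its interface.

```python
from typing import Dict, List, Set

# Hypothetical toy wordnet: each word maps to a set of closely related words.
# These entries are invented for illustration and are not EuroWordNet data.
RELATED: Dict[str, Set[str]] = {
    "car": {"automobile", "motorcar"},
    "film": {"movie", "picture"},
}


def expand_query(terms: List[str], related: Dict[str, Set[str]]) -> List[str]:
    """Return the query terms plus their related terms, without duplicates."""
    expanded: List[str] = []
    seen: Set[str] = set()
    for term in terms:
        # Keep the original term first, then its related terms in sorted order.
        for candidate in [term, *sorted(related.get(term, set()))]:
            if candidate not in seen:
                seen.add(candidate)
                expanded.append(candidate)
    return expanded


def search(documents: List[str], terms: List[str]) -> List[int]:
    """Return the indices of documents containing at least one of the terms."""
    wanted = set(terms)
    hits: List[int] = []
    for index, document in enumerate(documents):
        if wanted & set(document.lower().split()):
            hits.append(index)
    return hits


documents = ["the automobile was red", "a car passed", "a film about trains"]
plain_hits = search(documents, ["car"])
expanded_hits = search(documents, expand_query(["car"], RELATED))
```

In this toy collection the expanded query also matches the document that mentions only "automobile", which is the recall gain the objective refers to.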
The database has been designed with a number of important characteristics:
- it is multi-lingual,
- it can handle language-specific information extracted from diverse resources,
- it provides a formal system usable in information retrieval applications as well as in the development of more complex knowledge bases of the future.
The database also has some innovative features, such as:
- an 'interlingual index', a pool of concepts (the superset of all language-specific concepts), where all shared language independent information is stored,
- a facility to label relations such as disjunction, conjunction, factivity, reversal and negation,
- encoding of explicit semantic relations across parts-of-speech,
- different interpretations of particular WordNet 1.5 relations,
- addition of new relations.
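One way to picture these features is as a small data model. The sketch below is an assumption made for illustration, not the project's actual schema: the class names, field names, relation names and identifiers such as "ili-dog" are all hypothetical. It shows language-specific word meanings linked through a shared interlingual index entry, and relations that carry optional labels.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Relation:
    kind: str     # illustrative relation name, e.g. "has_hyperonym"
    target: str   # id of the target synset in the same language
    labels: Set[str] = field(default_factory=set)  # e.g. {"negation"}


@dataclass
class Synset:
    synset_id: str
    language: str
    pos: str                      # part of speech, e.g. "noun" or "verb"
    words: List[str]
    ili_id: Optional[str] = None  # link into the interlingual index, if any
    relations: List[Relation] = field(default_factory=list)


class Database:
    """Toy store that indexes synsets by id and by interlingual index entry."""

    def __init__(self) -> None:
        self.synsets: Dict[str, Synset] = {}
        self.by_ili: Dict[str, List[str]] = {}

    def add(self, synset: Synset) -> None:
        self.synsets[synset.synset_id] = synset
        if synset.ili_id is not None:
            self.by_ili.setdefault(synset.ili_id, []).append(synset.synset_id)

    def equivalents(self, synset_id: str, language: str) -> List[Synset]:
        """Find synsets in another language that share the same index entry."""
        source = self.synsets[synset_id]
        if source.ili_id is None:
            return []
        linked = self.by_ili.get(source.ili_id, [])
        return [self.synsets[s] for s in linked
                if self.synsets[s].language == language]


db = Database()
db.add(Synset("nl-1", "nl", "noun", ["hond"], ili_id="ili-dog"))
db.add(Synset("es-1", "es", "noun", ["perro"], ili_id="ili-dog"))
# A relation across parts of speech (noun to verb) carrying a label.
db.add(Synset("nl-2", "nl", "verb", ["blaffen"]))
db.synsets["nl-1"].relations.append(
    Relation("involved_in", "nl-2", labels={"disjunction"}))

spanish = db.equivalents("nl-1", "es")
```

Because equivalence is stored once, against the index entry, a new language can be added without pairwise links to every existing wordnet.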
For each relation, language-specific test sentences are provided, with examples, to verify the relation between word pairs. Public guides for coding the semantic relations in each language will be published. The manuals provide a check-list and explain how the tests should be applied to derive the semantic relations between word meanings. Finally, the database will have a specialised interface to cope with the complexity of a multi-lingual semantic database.
The coding of the first subset of the most fundamental meanings, the so-called 'base concepts', is in progress. Base concepts are used to define more specific concepts and their meanings in the languages involved. They have the most relations and occupy major positions in semantic hierarchies or taxonomies. A common set of base concepts has been defined by applying similar criteria across all the languages involved (English, Dutch, Italian and Spanish).
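The idea that base concepts are the meanings with the most relations can be sketched as a simple ranking. The code below is only an illustration under stated assumptions: it scores a concept by how many relations it takes part in, keeps the top n per language, and takes the intersection as the common set. The scoring rule, the use of an intersection, the function names and the sample data are all hypothetical, and concepts are assumed to be identified by shared, language-independent names.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

# (source concept, relation name, target concept)
Edge = Tuple[str, str, str]


def relation_counts(edges: Iterable[Edge]) -> Counter:
    """Count how many relations each concept takes part in."""
    counts: Counter = Counter()
    for source, _relation, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts


def top_concepts(edges: Iterable[Edge], n: int) -> Set[str]:
    """Return the n concepts with the most relations."""
    counts = relation_counts(edges)
    # Sort by descending count, then by name, so ties resolve deterministically.
    ranked = sorted(counts, key=lambda concept: (-counts[concept], concept))
    return set(ranked[:n])


def common_base_concepts(edges_by_language: Dict[str, List[Edge]],
                         n: int) -> Set[str]:
    """Return the concepts selected in every language."""
    selections = [top_concepts(edges, n)
                  for edges in edges_by_language.values()]
    return set.intersection(*selections) if selections else set()


# Invented sample data, not taken from any wordnet.
sample = {
    "nl": [("dog", "has_hyperonym", "animal"),
           ("cat", "has_hyperonym", "animal"),
           ("animal", "has_hyperonym", "entity")],
    "es": [("dog", "has_hyperonym", "animal"),
           ("bird", "has_hyperonym", "animal")],
}
shared = common_base_concepts(sample, n=1)
```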
The User Requirement and the Market
As direct user involvement in the project is quite modest, a larger user group of interested companies and institutions, currently with 35 members, has been established. The purpose of this group is to create a wider awareness of the use of this type of resource and to establish co-operation with other groups with a common interest. Each member of the user group receives key deliverables and data samples and is asked to provide feedback.
A number of different types of user have been identified:
- publishers, interested either in providing the initial resources or in the development of similar products (dictionaries, thesauri etc.),
- research institutes and R&D departments of companies working in the field of knowledge engineering or linguistic databases,
- organisations using or applying similar resources in the development of services or products which need multi-lingual semantic resources, such as WWW search engines,
- end-users interested in products helping them to manage their information resources.
The Way Ahead
The core of the database will be finished and extended with an official version of the merged top-ontology, and the results will be verified.
The aim is to produce a rich, high-quality coding of semantic relations and equivalence relations for a common set of about 5,000 base concepts in the four languages by mid-1997. By the end of that year this first subset will have been verified and made available for testing. The resources will be tested and demonstrated in an information retrieval system by Novell, one of the partners in the consortium.
Discussion fora and workshops have been arranged, where the project will lead discussion on the design, validation and standardisation of multi-lingual semantic databases.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

FP4-TELEMATICS 2C - Specific programme of research and technological development and demonstration in the area of telematic applications of common interest, 1994-1998

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

D.12 - Language Engineering

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

CSC - Cost-sharing contracts

Coordinator

University of Amsterdam

EU contribution

No data

Address

Spuistraat 134
1012 VB Amsterdam
Netherlands

Total cost

No data

Participants (3)

FUNDACION UNIVERSIDAD EMPRESA

Spain

EU contribution

No data

Address

5,SERRANO JOVER
28015 MADRID

Total cost

No data

Istituto di Linguistica Computazionale

Italy

EU contribution

No data

Address

Via della Faggiola 32
56126 Pisa

Total cost

No data

NOVELL BELGIUM N/V

Belgium

EU contribution

No data

Address

1,POSTHOFLEI
2600 ANTWERPEN-BERCHEM

Total cost

No data
