Objective
The project builds on and extends the results achieved in ACQUILEX by continuing research on theoretical issues in the design of lexicons and by constructing further and more substantial monolingual and multilingual knowledge base fragments from a mixture of MRDs and manual encoding. At the same time, the project intends to make considerable use of corpora as a further source of data for the semi-automatic construction of lexical resources. Substantial quantities of textual and transcribed spoken corpora are rapidly becoming available within the academic and dictionary publishing communities. Whilst MRDs provide a highly structured and focused source of lexical data, substantial corpora can supplement this with information concerning usage, frequency, and so forth. The proposed project will develop the software tools required to enable efficient use of corpora and will utilise them in the development of dictionary databases and the lexical knowledge base. In addition, we plan to draw far more directly on the expertise of professional lexicographers and of the dictionary publishing industry, both in investigating theoretical issues and in transferring the tools, techniques and insights of computational lexicology and lexicography to that community.
The research concerns the semi-automatic acquisition of lexical knowledge for natural language processing systems from machine-readable versions of conventional dictionaries (MRDs) for English, Spanish, Italian and Dutch.
The project involves research on theoretical issues in the design of lexicons and constructing further and more substantial monolingual and multilingual knowledge base fragments on the basis of a mixture of MRDs and manual encoding.
A more sophisticated lexical representation language capable of dealing with more complex defeasible phenomena, such as blocking, has been developed and a formal semantics for this formalism specified. Corpus analysis tools have been developed for collocational analysis and lexical tagging. Further multilingual lexicon fragments exemplifying these developments are being constructed.
APPROACH AND METHODS
Work on ACQUILEX can be divided into two areas: first, the development of a methodology and the construction of software tools to create lexical databases from MRDs; and second, the subsequent construction of illustrative, theoretically motivated lexical knowledge base fragments from these databases, using a further set of software tools designed to integrate, enrich and formalise the database information. The emphasis of effort in ACQUILEX was on the development of lexical databases from MRDs and on the design and implementation of a lexical representation language to underpin the lexical knowledge base. In this project, the emphasis is on the exploitation of these databases and the lexical knowledge base framework in the construction of more, and more substantial, lexicon fragments, and on the investigation of theoretical issues in lexicon design within the unique research environment provided by the lexical databases and analysed corpora.
POTENTIAL
The project aims to foster productive collaboration between the computational linguistic and lexicographical communities. We envisage that the proposed research and training activities will contribute to improvements in the quality of dictionaries (particularly bilingual, multilingual and learners' dictionaries) and in the productivity of the dictionary publishing industry, and will provide impetus to electronic publishing initiatives. These benefits are in addition to the central goal of producing a multilingual lexical knowledge base which can be deployed in habitable and practical natural language processing applications.
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.
- natural sciences > computer and information sciences > software
- natural sciences > computer and information sciences > databases
- natural sciences > computer and information sciences > data science > natural language processing
- social sciences > economics and business > economics > production economics > productivity
Programme(s)
Multi-annual funding programmes that define the EU’s priorities for research and innovation.
Topic(s)
Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.
Data not available
Call for proposal
Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
Data not available
Funding Scheme
Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.
Data not available
Coordinator
CB2 1TN CAMBRIDGE
United Kingdom