Acquisition of Lexical Knowledge for Natural Language Processing Systems

Objective

The ACQUILEX Action aimed at solving problems in the following areas:
-the design of a general computational model of a dictionary entry
-the degree to which machine-readable dictionaries contain lexical, syntactic and semantic information which can be made explicit and can be reused in natural language processing systems
-the development of methods and techniques for the extraction of syntactic and semantic information from machine-readable dictionaries
-the semi-automatic construction of taxonomies starting from machine-readable dictionaries
-the design and organisation of a single lexical knowledge-base with concepts and relations for the different languages involved.
Techniques and methodologies were developed for using existing machine-readable dictionaries in the construction of components for natural language processing systems. The main focus was on the extraction of lexical information from multiple machine-readable sources in a multilingual context, with the overall goal of constructing a single multilingual lexical knowledge base.

The results achieved so far include:
a second release of the lexical database software;
conversion of the machine-readable dictionary sources into the lexical database (LDB);
supply of semantic taxonomies;
an English analyser system;
first releases of the lexical knowledge base software;
definition of a common set of types and feature structures.

Research is currently being undertaken on:
continuing the construction of taxonomies for the different monolingual dictionaries;
merging of taxonomies derived from different dictionaries;
implementing the lexical knowledge base system;
definition of a system of typed feature structures for the formalisation of syntactic and semantic information;
development of a prototype of the natural language processing test bed system.
APPROACH AND METHODS
The approach was based on theoretical linguistics, computational lexicography and lexicology, and computational linguistics. The main themes were:
-A formal analysis of the structure and content of existing machine-readable dictionaries, in order to design an explicit and standardised representation language for the computational model of the dictionary entry.
-The design of procedures for the extraction of super-ordinates from natural language definitions, for their disambiguation, and for the construction of taxonomies throughout the lexicon.
-The design of procedures for the linguistic and computational analysis of natural language definitions, with the aim of extracting all the semantic information implicit in them.
-The study of ways of representing the semantic and syntactic information which is extracted in a unification-based lexical representation language using typed feature structures and supporting default inheritance.
-The design and implementation of basic software for the creation, accessing and processing of lexical databases and lexical knowledge-bases.
-The study of how to link taxonomies and conceptual or relational information coming from different sources (either monolingual or multilingual).
-The design of a natural language processing test-bed for the information extracted from machine-readable dictionaries.
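The extraction of super-ordinates and the construction of taxonomies described above can be illustrated with a minimal sketch. It is not the ACQUILEX procedure: the toy definitions, the word lists and the function names (`extract_genus`, `build_taxonomy`, `hypernym_chain`) are all assumptions made for illustration, the head noun is found with a naive heuristic rather than a real parser, and sense disambiguation is left out entirely.

```python
import re

# Toy definitions, invented for illustration; not taken from any real dictionary.
DEFINITIONS = {
    "spaniel": "a dog with long ears",
    "dog": "a domesticated animal that barks",
    "cat": "a small animal kept as a pet",
    "animal": "a living creature that can move",
}

# Hypothetical word lists for the heuristic below.
DETERMINERS = {"a", "an", "the", "any", "some"}
# Words assumed to open a post-modifier; the genus is the word just before them.
POST_MODIFIER_STARTS = {"with", "that", "which", "who", "of", "for", "kept", "used"}


def extract_genus(definition):
    """Return the assumed head noun of the first noun phrase in a definition."""
    tokens = re.findall(r"[a-z]+", definition.lower())
    # Drop a leading determiner, if any.
    if tokens and tokens[0] in DETERMINERS:
        tokens = tokens[1:]
    head = None
    for token in tokens:
        if token in POST_MODIFIER_STARTS:
            break
        head = token  # the last word before the post-modifier is taken as the head
    return head


def build_taxonomy(definitions):
    """Map each headword to its extracted super-ordinate (None if nothing found)."""
    return {word: extract_genus(text) for word, text in definitions.items()}


def hypernym_chain(word, taxonomy):
    """Follow super-ordinate links upward, stopping at unknown words or cycles."""
    chain = [word]
    seen = {word}
    while True:
        parent = taxonomy.get(chain[-1])
        if parent is None or parent in seen:
            break
        chain.append(parent)
        seen.add(parent)
    return chain


taxonomy = build_taxonomy(DEFINITIONS)
print(hypernym_chain("spaniel", taxonomy))
```

With these toy entries the chain for "spaniel" runs through "dog" and "animal" up to "creature", where it stops because "creature" has no entry of its own. A real dictionary would need proper parsing of definitions and a disambiguation step to pick the intended sense of each super-ordinate.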
PROGRESS AND RESULTS - STATUS AS OF OCTOBER 1991
The results achieved in the last year are:
-a second release of the lexical database software
-conversion of the machine-readable dictionary sources into the LDB
-supply of semantic taxonomies
-an English analyser system
-first releases of the Lexical Knowledge Base Software
-definition of a common set of types and feature structures
The Action is presently working on:
-continuing the construction of taxonomies for the different monolingual dictionaries;
-merging of taxonomies derived from different dictionaries;
-implementing the lexical knowledge base system;
-definition of a system of typed feature structures for the formalisation of syntactic and semantic information;
-development of a prototype of the natural language processing test-bed system.
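As a rough illustration of what a type system with default inheritance provides, the sketch below resolves the features of a type by walking up a small hierarchy and letting more specific types override inherited defaults. It is only a simplified analogy, not the ACQUILEX lexical representation language: the types, the feature names and the function `resolve_features` are hypothetical, and flat attribute-value maps stand in for real typed feature structures with unification.

```python
# Hypothetical type hierarchy: each type names its parent and its own default features.
TYPE_HIERARCHY = {
    "entity": {"parent": None, "defaults": {"countable": True}},
    "animal": {"parent": "entity", "defaults": {"animate": True, "legs": 4}},
    "bird": {"parent": "animal", "defaults": {"legs": 2, "can_fly": True}},
    "penguin": {"parent": "bird", "defaults": {"can_fly": False}},
}


def resolve_features(type_name, hierarchy):
    """Collect features from the root down, so that specific types override defaults."""
    # Build the lineage from the given type up to the root.
    lineage = []
    current = type_name
    while current is not None:
        lineage.append(current)
        current = hierarchy[current]["parent"]
    # Apply defaults from the most general type to the most specific one.
    features = {}
    for name in reversed(lineage):
        features.update(hierarchy[name]["defaults"])
    return features


print(resolve_features("penguin", TYPE_HIERARCHY))
```

In this toy hierarchy "penguin" inherits `legs` from "bird" and `animate` from "animal", but overrides the inherited `can_fly` default with its own value.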
POTENTIAL
The research themes tackled within ACQUILEX aimed to address one of the major bottlenecks of natural language processing: the limited availability of large computational lexicons, with particular emphasis on making semantic information explicit and accessible.

Topic(s)

Data not available

Call for proposals

Data not available

Funding scheme

Data not available

Coordinator

Università degli Studi di Pisa
EU contribution
Data not available
Address
Via Risorgimento 9
56126 Pisa
Italy

Total cost
Data not available

Participants (6)