Ontology driven analysis of nominal systematic polysemy in WordNet

Final Report Summary - ODASP (Ontology driven analysis of nominal systematic polysemy in WordNet)

Polysemy is pervasive in every natural language. It ranges from accidental (local) polysemy, due to the particular historical evolution of a language, to more systematic forms of polysemy that reflect deeper cognitive principles of knowledge organization. This raises notorious difficulties for any system that manages linguistically encoded information: it affects natural language processing applications and, more generally, any system that manages semantic information. In particular, in many knowledge management tasks it has become common practice to use lexical databases, structured in terms of semantic relations (hyponymy, meronymy, etc.), to represent knowledge of the domain, that is, as an ontology. Achieving a principled treatment of polysemy, and in particular of its more systematic, ontologically and cognitively grounded forms, is thus a major challenge.

In this perspective, the ODASP project addresses the particular difficulties raised by the treatment of inherent nominal systematic polysemy in WordNet, a wide-coverage English lexical database that is currently used extensively for both linguistic and non-linguistic applications. WordNet’s lack of a principled treatment of this kind of polysemy is notoriously problematic: it results in conceptual incoherence and confusion, and raises serious difficulties for information systems that use WordNet as an ontology. In particular, the presence of multiple inheritance in WordNet is problematic when the hyponym/hypernym relation is interpreted as a subsumption relation (‘IS A’ for short), a problem that has been identified as the “IS A overloading problem”. The ODASP project tackles this issue with an ontology-based approach, capitalizing on the one hand on the rich philosophical tradition that investigates the notions of ontological dependence and property inheritance, and on the other hand on the results of theoretical linguistic investigations.

To achieve its objectives, ODASP has produced research results at the intersection of theoretical linguistics, lexical resources, and ontologies for information systems, drawing on the conceptual structures studied in philosophical ontology. These results form the material of international journal and conference articles, and have provided the occasion to set up new networks of collaboration.

* One of the strategies WordNet uses to represent inherent nominal systematic polysemy is multiple inheritance. While this introduces notorious structural inconsistencies, it is also a feature that ODASP has exploited, in combination with measures of semantic similarity, to identify the proportion of WordNet that is affected by this kind of polysemy.

* To resolve the inconsistencies induced by nominal systematic polysemy, ODASP proposes to introduce a novel kind of semantic relation in WordNet: a constitution relation (and its “dot” counterpart). This implies carefully distinguishing complex meanings, which constitute a separate semantic hierarchy, from the simple meanings that compose them and that fall under the categories of WordNet’s existing top level.

* ODASP proposes a formal ontological theory of the constitution/dot relation, that is, an axiomatic representation of the relation based on the DOLCE high-level ontology. Used as an ontological backbone for WordNet, this formal theory provides the tools required to make semantic inferences.

* Other ontologies based on WordNet’s structure propose inventories of complex categories corresponding to patterns of systematic polysemy (CoreLex being the most influential), but they present significant drawbacks due to the purely distributional methods employed. In particular, (i) they preserve WordNet’s structure and thus inherit its inconsistencies, and (ii) they do not distinguish among different kinds of systematic polysemy that actually reflect different ontological relations and constraints. ODASP avoids these two limitations by adopting a top-down, ontology-driven approach, based on a prior careful conceptual analysis of the ontological constraints and relations underlying the semantic phenomenon of inherent nominal systematic polysemy.

* The research conducted reveals the conceptual categorization problem underlying the lexical phenomenon of inherent nominal systematic polysemy. In so doing, it establishes the relevance of the results obtained for ontologies in general, beyond the limited range of linguistic ontologies. The ontology-based approach adopted for the study of these linguistic phenomena is thus expected to find important applications beyond the scope of lexical ontologies, and of WordNet in particular. It provides principles of categorization that can improve the performance of many ontology-driven knowledge management systems. For example, ODASP has studied the particular case of the Unified Medical Language System, an ontology widely used by biomedical information systems and services.

* Finally, ODASP was engaged in knowledge transfer at the university level through collaboration with the University of Pavia and the co-supervision of one MSc student at the University of Trento on ontological and corpus-based methods for the study of systematic polysemy. Knowledge transfer was also pursued at the European and international level, through research collaborations with the IRIT laboratory of the CNRS in Toulouse (France) and with the Princeton team in charge of the English WordNet in the United States.
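
The first point in the list above, using multiple inheritance to locate the part of a lexical hierarchy affected by systematic polysemy, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny hypernym graph is invented toy data rather than WordNet content, the function names are hypothetical, and the "parents fall under different top-level categories" test is only a crude stand-in for the semantic similarity measures the report mentions. It is not ODASP's actual procedure.

```python
from itertools import combinations
from typing import Dict, List, Set

# Toy hypernym graph (invented data, NOT taken from WordNet):
# each synset name maps to the list of its direct hypernyms.
HYPERNYMS: Dict[str, List[str]] = {
    "entity": [],
    "physical_object": ["entity"],
    "abstraction": ["entity"],
    "artifact": ["physical_object"],
    "information": ["abstraction"],
    "tool": ["artifact"],
    "hammer": ["tool"],
    "claw_hammer": ["hammer", "tool"],     # redundant link, same branch
    "book": ["artifact", "information"],   # physical object AND information
}

# Assumed top-level categories of the toy hierarchy.
TOP_LEVEL: Set[str] = {"physical_object", "abstraction"}


def ancestors(synset: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return all transitive hypernyms of a synset."""
    seen: Set[str] = set()
    stack = list(graph[synset])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def top_level_categories(synset: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Top-level categories that a synset is, or falls under."""
    return ({synset} | ancestors(synset, graph)) & TOP_LEVEL


def multi_parent_synsets(graph: Dict[str, List[str]]) -> List[str]:
    """Synsets with more than one direct hypernym (multiple inheritance)."""
    return sorted(s for s, parents in graph.items() if len(parents) > 1)


def polysemy_candidates(graph: Dict[str, List[str]]) -> List[str]:
    """Multi-parent synsets with two parents under disjoint top-level
    categories. This disjointness test is an illustrative proxy for a
    low-similarity filter, not a real similarity measure."""
    result = []
    for synset in multi_parent_synsets(graph):
        cats = [top_level_categories(p, graph) for p in graph[synset]]
        if any(a and b and a.isdisjoint(b) for a, b in combinations(cats, 2)):
            result.append(synset)
    return result


candidates = polysemy_candidates(HYPERNYMS)
proportion = len(candidates) / len(HYPERNYMS)
print(candidates, f"{proportion:.2%}")
```

On this toy graph both "book" and "claw_hammer" have two direct hypernyms, but only "book" is kept as a candidate, because its parents fall under different top-level categories. Reading both links as ‘IS A’ would make a book at once a physical object and an abstraction, which is the kind of inconsistency the constitution/dot relation is meant to replace.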