CORDIS - EU research results
Content archived on 2024-05-28

Compositional Operations in Semantic Space

Final Report Summary - COMPOSES (Compositional Operations in Semantic Space)

Two unique abilities underlie the human capacity to communicate using language. On the one hand, any adult speaker of a language is familiar with the meaning of thousands of its words (including, for example, "pink", "elephant", "drive" and "car"). On the other hand, we are able to compose the meanings of single words to produce and understand the meaning of an infinity of new linguistic expressions (for example, "pink elephants drive pink cars").
Computer scientists have made much progress in developing automated systems endowed with human-like capabilities to learn distributed word meanings on a massive scale from simple statistical patterns in large collections of text (such methods allow, for example, search engines to respond to keyword queries with relevant information). Meanwhile, linguists and philosophers have developed a "grammar" of composition that can derive the meaning of a phrase or sentence from the meaning of its parts. Until now, however, these research traditions have remained largely disjoint.
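The distributional idea mentioned above can be illustrated with a minimal sketch: word meanings are represented as vectors of co-occurrence counts collected from text, and words that appear in similar contexts end up with similar vectors. The toy corpus and all counts below are invented for illustration; real systems are trained on billions of words.

```python
# Toy sketch of distributional word meanings: co-occurrence vectors
# built from a tiny hand-made corpus (invented for illustration).
from collections import Counter, defaultdict
from math import sqrt

corpus = [
    "pink elephants drive pink cars".split(),
    "elephants eat leaves".split(),
    "people drive cars".split(),
    "people eat bread".split(),
]

# Count, for each word, the other words it co-occurs with in a sentence.
cooc = defaultdict(Counter)
for sent in corpus:
    for i, w in enumerate(sent):
        for j, c in enumerate(sent):
            if i != j:
                cooc[w][c] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = lambda x: sqrt(sum(n * n for n in x.values()))
    return dot / (norm(u) * norm(v)) if dot else 0.0

# "elephants" and "people" both drive and eat in this corpus,
# so their vectors come out more similar than unrelated pairs.
print(cosine(cooc["elephants"], cooc["people"]))
```

Even at this toy scale, the similarity of "elephants" and "people" exceeds that of, say, "elephants" and "bread", which is the kind of statistical signal large-scale systems exploit.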
The COMPOSES project brought together, for the first time, methods from computer science for large-scale statistical learning of word meaning with ideas from theoretical linguistics on how to combine these single-word meanings to derive phrases and sentences, in order to develop a novel computational system with truly human-like meaning comprehension abilities.
The COMPOSES system, trained on natural linguistic data without any manual supervision, is able to provide accurate paraphrases of phrases and simple sentences: It proposes "false belief" as a paraphrase of "fallacy", "promises before election" as another way to express the concept of "pre-election promises" and so on. The system is able to translate similar phrases from one language to another, and it has learned how to process phrases with complex structures (for example, it can tell that "rapid social change" sounds more natural than "social rapid change", and that a "live fish transporter" is more likely to refer to a transporter of live fish than to a fish transporter who is not dead). The COMPOSES system can also tell whether a newly coined phrase makes sense or not (both "peaceful attorney" and "parallel biscuit" are new phrases, but it is easier to assign a sensible meaning to the former than the latter). Finally, the COMPOSES system understands how composition affects simple forms of reasoning (a "parrot" is a "pet", a "green parrot" is still a "pet", but how about a "dead parrot"?).
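One simple composition operation studied in this line of work can be sketched as follows: represent a phrase as the sum of its word vectors, then rank paraphrase candidates by cosine similarity to the composed vector. The 4-dimensional vectors below are hand-made toys, not the representations the actual system learned; they only illustrate the mechanism behind paraphrases such as "fallacy" for "false belief".

```python
# Hedged sketch of additive composition with toy word vectors
# (all numbers invented for illustration).
import numpy as np

vec = {
    "false":   np.array([0.9, 0.1, 0.0, 0.2]),
    "belief":  np.array([0.1, 0.9, 0.3, 0.0]),
    "fallacy": np.array([0.8, 0.7, 0.2, 0.1]),
    "parrot":  np.array([0.1, 0.2, 0.8, 0.1]),
}

def compose(*words):
    """Additive composition: phrase vector = sum of word vectors."""
    return np.sum([vec[w] for w in words], axis=0)

def cos(u, v):
    """Cosine similarity between two dense vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

phrase = compose("false", "belief")
# A good paraphrase candidate lies close to the composed phrase vector,
# while an unrelated word lies further away.
print(cos(phrase, vec["fallacy"]), cos(phrase, vec["parrot"]))
```

Richer operations (for example, matrices for modifiers rather than plain vector addition) are what let a system distinguish "rapid social change" from "social rapid change", since addition alone is order-insensitive.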
The COMPOSES approach has been extended to handle longer sentences, and in particular to capture the inferences they afford (from "a little boy is playing in the kitchen" you can infer that "the apartment is not empty", but not that "there is no little boy playing in the kitchen", despite the fact that, on the surface, the third sentence is more similar to the first than the second is). At the other end of the composition scale, our methods have also been successfully applied to deriving word meanings from their components, and can predict, for example, that, while neither "harassable" nor "windowist" is an attested English word form, subjects have an easier time assigning a coherent meaning to the first than to the second.
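The word-internal composition described above can be sketched in a deliberately simplified way: model a suffix such as "-able" as the average vector offset between known verb/adjective pairs, then apply that offset to a new stem to predict a vector for an unattested form like "harassable". All vectors here are invented toys, and the project itself explored richer, function-based operations rather than plain offsets.

```python
# Toy sketch of deriving word meanings from their parts: learn the
# suffix "-able" as a mean vector offset over known pairs, then apply
# it to a new stem (all vectors invented for illustration).
import numpy as np

vec = {
    # hypothetical verb / "-able" adjective training pairs
    "accept":     np.array([1.0, 0.2, 0.1]),
    "acceptable": np.array([1.1, 0.3, 0.9]),
    "read":       np.array([0.4, 1.0, 0.2]),
    "readable":   np.array([0.5, 1.1, 1.0]),
    # a new stem to derive from
    "harass":     np.array([0.2, 0.6, 0.1]),
}

pairs = [("accept", "acceptable"), ("read", "readable")]
# The suffix meaning, approximated as the mean adjective-minus-verb offset.
able = np.mean([vec[adj] - vec[verb] for verb, adj in pairs], axis=0)

# Predicted vector for the unattested form "harassable".
harassable = vec["harass"] + able
print(harassable)
```

A predicted vector that lands in a densely populated, coherent region of semantic space corresponds to a novel form that is easy to interpret; a form like "windowist" would land nowhere useful.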
We believe that our computational approach can shed light on how humans learn and use compositional competence at the semantic level. In future research, we would like to scale up the approach beyond generic sentence meaning to language use in real-life contexts, thus looking at grounding in different modalities and the function of words and utterances in a broad linguistic discourse setup.