Generating metaphors using a combination of AI reasoning and corpus-based modeling of formulaic expressions

Final Report Summary - GEN-META (Generating metaphors using a combination of AI reasoning and corpus-based modeling of formulaic expressions)

Metaphor is a prominent research focus in many fields: linguistics, literature, philosophy, psychology, psychiatry, education, business, politics, etc. Metaphor is important in all sorts of mundane discourse: ordinary conversation, news articles, popular novels, advertisements, etc. Issues of prime human interest -- such as relationships, money, disease, states of mind, passage of time -- are often most economically and understandably conveyed through metaphor.

For a phenomenon as central to our lives as metaphorical language, one would expect the question of how it is produced to be a priority within the cognitive sciences. Indeed, work on the production of metaphorical language has been carried out in areas as diverse as developmental studies of children's thought, psycholinguistics, foreign language learning, neurolinguistics and artificial intelligence. Yet the research done so far on metaphor production is, in total, minor compared with the research on how metaphor is understood, and we are even further from an adequate account of metaphor generation than from an adequate account of metaphor understanding (even though the latter topic is itself still full of problems). There are, for instance, important open questions as to what governs speakers' particular (though probably usually unconscious) choices about whether to say something metaphorically at all, and, if so, about which familiar metaphorical way(s) of looking at things to use, which (if any) particular familiar metaphorical phrases to use, how to vary the familiar phrasing, etc.

The ubiquity of metaphor presents an important challenge, in particular, for artificial intelligence (AI) systems, which need both to understand and to produce metaphor in order to take part naturally in communication with people and to understand human-human communication better. The GenMeta project (http://www.cs.bham.ac.uk/~gargetad/genmeta-index.html) is an AI study with a special focus on metaphor generation that also looks at metaphor understanding. Indeed, it links these two matters in a unique way.

The overall scientific objectives of the project are as follows:

(A) Create, in prototype and partial form, a natural language understanding and generation system, orientated towards dialogue, using state-of-the-art generation mechanisms, and focusing mainly on metaphor.

(B) Include, in that system, some initial mechanisms for choosing: whether to use metaphor at all; particular metaphorical ways of looking at the subject matter at hand; and particular metaphorical words or phrases.

(C) Include some initial provisions in that system for creating natural forms of metaphorical expression as revealed by study of language corpora.

In the project, the understanding side of the prototype system has been created partly by bridging between two existing systems: the Embodied Construction Grammar (ECG) system developed at the University of California, Berkeley, and our own pre-existing ATT-Meta system. ATT-Meta includes the meaning-representation facilities and inference mechanisms needed for metaphor understanding, but did not previously take actual natural-language sentences as input; ECG provides this link to sentences. The generation side of the prototype system has been created by connecting ATT-Meta to a system developed at Pompeu Fabra University in Barcelona that takes meaning representations and uses them to create sentences expressing the meanings. The connection translates ATT-Meta-style meaning representations into the style used by the Barcelona system.

These developments provide an initial proof-of-concept for (a) a system for metaphor understanding going beyond, in terms of depth and flexibility of understanding, the capabilities of the (few) other metaphor understanding systems that have been developed, together with (b) almost uniquely, a system that can take information about a system and create metaphorically expressed linguistic output using whatever metaphorical ways of thinking are familiar to the system from the point of view of understanding of metaphor.

In addition, (a) and (b) are linked in a unique and deep way not immediately apparent from the above. Intuitively, metaphor casts something A as something else B (e.g. in thinking of money as a liquid in a statement like "Money leaked out of his bank account"). Accordingly, metaphor understanding is usually thought of as going from a representation of the literal meaning of the sentence (in our example, the ridiculous one that the money in question is acting as a liquid) to the actual meaning (namely, say, that the amount of money in the account has been gradually reducing for some extraneous reason). Thus, there are mechanisms for converting information from the "source" domain of the metaphor (liquid) to the "target" domain (money). However, the ATT-Meta system has the unique property that, in service of this ultimate goal, it sometimes profits along the way from converting information in the opposite direction. But this opposite direction is precisely what is needed for generating metaphorical sentences given initial non-metaphorical meaning representations. Thus, by basing (a) and (b) above on ATT-Meta, we achieve an unprecedentedly tight connection between understanding and generation.
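The idea that one set of correspondences can serve both directions can be illustrated with a deliberately tiny sketch. It is not ATT-Meta's representation or inference mechanism: the mapping table, the triple format and the function names `understand` and `generate` are all assumptions made purely for illustration of the money-as-liquid example.

```python
# Toy illustration only: a hypothetical MONEY-AS-LIQUID mapping.
# This is NOT how ATT-Meta represents or reasons about metaphor.
SOURCE_TO_TARGET = {
    "liquid": "money",
    "container": "bank account",
    "leaks out of": "gradually decreases in",
}

# The reverse table is derived from the same correspondences,
# so understanding and generation share one body of knowledge.
TARGET_TO_SOURCE = {target: source for source, target in SOURCE_TO_TARGET.items()}


def translate(triple, mapping):
    """Map each element of a (subject, relation, object) triple.

    Items with no entry in the mapping are passed through unchanged.
    """
    return tuple(mapping.get(item, item) for item in triple)


def understand(source_triple):
    """Source domain (liquid) -> target domain (money)."""
    return translate(source_triple, SOURCE_TO_TARGET)


def generate(target_triple):
    """Target domain (money) -> source domain (liquid)."""
    return translate(target_triple, TARGET_TO_SOURCE)


if __name__ == "__main__":
    literal = ("liquid", "leaks out of", "container")
    meaning = understand(literal)
    print(meaning)
    print(generate(meaning))
```

Because the reverse table is built from the forward one, any correspondence added for understanding becomes available for generation as well, which is the kind of coupling the paragraph above describes in far richer form.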

Another aspect of the understanding side of the system has also led to significant results. A major part of the project has been devoted to the task of detecting the presence of metaphor in text. Distinctive contributions here include a demonstration of how the performance of metaphor detection can be boosted by combining information about the imageability, concreteness and emotionality/evaluativeness of text, and by also incorporating information about the grammatical structure of the sentence.
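As a rough sketch of what combining such information might look like, the example below scores a subject-verb pair using the three kinds of lexical score plus one grammatical feature. Everything specific in it is an assumption: the lexicon values, the feature definitions, the weights and the function name `metaphor_score` are invented for illustration and are not the project's detector, whose features would be drawn from published norms and whose weights would be learned from annotated corpora.

```python
import math

# Hypothetical per-word scores in [0, 1]; invented for this sketch.
LEXICON = {
    "confidence": {"imageability": 0.30, "concreteness": 0.15, "emotionality": 0.60},
    "water": {"imageability": 0.95, "concreteness": 0.95, "emotionality": 0.20},
    "leaked": {"imageability": 0.75, "concreteness": 0.85, "emotionality": 0.35},
}

# Invented weights; in practice these would be learned from annotated data.
WEIGHTS = {
    "concreteness_gap": 3.0,
    "verb_imageability": 1.0,
    "mean_emotionality": 1.0,
    "is_subject_verb": 0.5,
}
BIAS = -2.0


def features(subject, verb, relation):
    """Build a small feature dictionary for one grammatical relation."""
    s, v = LEXICON[subject], LEXICON[verb]
    return {
        # A concrete verb applied to an abstract subject gives a large gap.
        "concreteness_gap": v["concreteness"] - s["concreteness"],
        "verb_imageability": v["imageability"],
        "mean_emotionality": (s["emotionality"] + v["emotionality"]) / 2,
        # Grammatical-structure feature: is this a subject-verb relation?
        "is_subject_verb": 1.0 if relation == "subject-verb" else 0.0,
    }


def metaphor_score(subject, verb, relation="subject-verb"):
    """Return a score in (0, 1); higher means more likely metaphorical."""
    feats = features(subject, verb, relation)
    z = BIAS + sum(WEIGHTS[name] * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    print(metaphor_score("confidence", "leaked"))  # abstract subject, concrete verb
    print(metaphor_score("water", "leaked"))       # concrete subject, concrete verb
```

With these invented numbers, "confidence leaked" scores above 0.5 and "water leaked" below it, mainly because of the concreteness gap between verb and subject.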

To help the above work, but also as a free-standing output of the project, new annotated corpora (large collected bodies) of text that include instances of metaphor have been created. For example, the PoliCon corpus was constructed. This is a corpus of metaphors in political conflict discourse, collected from online forums involving debates about politically controversial topics, including immigration, security and climate change. These texts have been both manually and automatically annotated with various types of information that should be useful to other metaphor researchers or to people interested in other ways in the nature of political language. The project has also proceeded similarly on a corpus of illness metaphors.

The project has not aspired to producing a fully operational metaphor understanding/generation system that could be immediately plugged into applications -- this would have been a hugely over-ambitious goal to accomplish in a mere two years. Rather, it has aimed at developing proofs of concept, as specified above. Nevertheless, the work has made some significant steps along the road towards greatly increasing the relevance and usefulness of automated natural language processing in a range of everyday activities. Expected benefits of this trajectory of research include, for example, helping to improve the inclusion of marginalized people in the digital economy of Europe (by improving the naturalness with which computer interfaces can communicate), and also helping to improve language teaching technology.