CORDIS - EU research results

Induction of Broad-Coverage Semantic Parsers

Periodic Reporting for period 4 - BroadSem (Induction of Broad-Coverage Semantic Parsers)

Reporting period: 2020-05-01 to 2022-04-30

The lack of accurate methods for predicting meaning representations of texts is the key bottleneck for many NLP applications such as question answering, text summarization and information retrieval. Such methods would, for example, enable users to pose questions to systems relying on knowledge contained in Web texts rather than in hand-crafted ontologies.

Although state-of-the-art semantic analyzers work fairly well on closed domains (e.g. interpreting natural language queries to databases), accurately predicting even shallow forms of semantic representations (e.g. frame-semantic parsing) for less restricted texts remains a challenge. The reason for this unsatisfactory performance is the reliance on supervised learning: the amount of annotation required for accurate open-domain parsing exceeds what is practically feasible.

In this project, we address this challenge by defining expressive statistical models which exploit not only annotated data but also information contained in unannotated texts (e.g. on the Web). Rather than modeling sentences in isolation, we will model relations between facts, both within and across different texts, and also exploit linking to facts present in knowledge bases. This ‘linked’ setting will let us both discover inference rules (i.e. learn that one fact implies another) and induce semantic representations better suited to applications requiring reasoning.
In this reporting period, we have primarily considered supervised and unsupervised modeling.

First, we developed accurate semantic parsers for the PropBank semantic role labeling and abstract meaning representation formalisms.

Second, we developed an effective method for linking mentions of the same concepts or referents across sentences in a text (co-reference resolution). This step is necessary for modeling relations between facts in a document and for training the model to 'read between the lines'.
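The report does not spell out the model itself; purely as an illustration of the antecedent-ranking view of co-reference resolution adopted by most neural systems, the sketch below greedily links each mention to its best-scoring preceding mention or starts a new entity. The scorer, threshold and mention vectors are hypothetical placeholders, not the project's implementation.

```python
import torch

def score_antecedent(mention_vec, antecedent_vec):
    # hypothetical pairwise scorer; real systems use a learned feed-forward network
    return torch.dot(mention_vec, antecedent_vec).item()

def resolve(mention_vecs, new_entity_threshold=0.0):
    # greedy clustering: each mention links to its best-scoring preceding mention,
    # or starts a new entity if no candidate scores above the threshold
    clusters, next_cluster = [], 0
    for i, mention in enumerate(mention_vecs):
        best_score, best_j = new_entity_threshold, None
        for j in range(i):                     # candidate antecedents precede the mention
            score = score_antecedent(mention, mention_vecs[j])
            if score > best_score:
                best_score, best_j = score, j
        if best_j is None:
            clusters.append(next_cluster)      # no plausible antecedent: new entity
            next_cluster += 1
        else:
            clusters.append(clusters[best_j])  # join the antecedent's entity
    return clusters

# toy usage with random 8-dimensional mention representations
print(resolve([torch.randn(8) for _ in range(5)]))
```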

We developed a toolbox of methods for inducing structured prediction models (e.g. semantic parsers) from partial or weak supervision.

Additionally, we developed methods for integrating linguistic structure (e.g. semantic representations) into otherwise linguistically uninformed neural models. Specifically, we introduced a class of graph neural networks suitable for encoding semantic properties of sentences while relying on prior knowledge represented as labeled directed graphs. We also demonstrated that the graphs do not need to be predefined or produced by an off-the-shelf tool; instead, they can be induced automatically from the data. In other words, we can simultaneously induce graphs and learn neural networks operating on these graphs. This method, or its variations, will be used in the next stages of the project (including jointly modeling knowledge bases and text).
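To illustrate the kind of architecture described above, the sketch below shows a single graph neural network layer that updates token representations with messages sent along and against the labeled directed edges of a sentence graph (e.g. a dependency or semantic graph), using one transformation per edge label and direction plus a self-loop. It is a minimal, generic PyTorch sketch; the class and variable names are invented for the example and do not correspond to the project's released code.

```python
import torch
import torch.nn as nn

class LabeledGraphConvLayer(nn.Module):
    def __init__(self, dim, num_edge_labels):
        super().__init__()
        # one transformation per edge label and direction, plus a self-loop
        self.w_out = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_edge_labels)])
        self.w_in = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_edge_labels)])
        self.w_self = nn.Linear(dim, dim)

    def forward(self, h, edges):
        # h: (num_tokens, dim) token vectors, e.g. produced by an LSTM or BERT encoder
        # edges: (head, dependent, label) triples of a labeled directed graph over the tokens
        out = [self.w_self(h[i]) for i in range(h.size(0))]   # self-loop term
        for head, dep, label in edges:
            out[dep] = out[dep] + self.w_out[label](h[head])  # message along the edge
            out[head] = out[head] + self.w_in[label](h[dep])  # message against the edge
        return torch.relu(torch.stack(out))

# toy usage: 5 tokens with 16-dimensional states and 3 hypothetical edge labels
layer = LabeledGraphConvLayer(dim=16, num_edge_labels=3)
tokens = torch.randn(5, 16)
graph = [(1, 0, 0), (1, 4, 2), (4, 3, 1)]
print(layer(tokens, graph).shape)   # torch.Size([5, 16])
```

Inducing the graphs themselves, rather than taking them from a parser, would amount to learning the edge set (e.g. via a differentiable relaxation) jointly with the weights of such layers.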

Moreover, graphs are a standard way of representing structured knowledge. The graph neural network modeling and graph induction frameworks we developed have many potential applications in natural language processing beyond modeling semantics.

The key limitation of graph neural networks is their lack of interpretability. We have developed methods for analyzing graph neural networks; most of these interpretation methods apply to a broader class of neural architectures (e.g. Transformers).

Besides these novel contributions, we focused on preparing data and establishing benchmarks for evaluating semantic parsers.
The key areas where we achieved progress beyond the state of the art are the following:

1. The semantic parsing models we introduced are effective and fast, and surpass comparable methods on standard benchmarks in multiple languages. Semantic role labeling (SRL) is a crucial natural language processing (NLP) problem, and the availability of a simple multilingual method is an important step toward even broader use of SRL in NLP applications (and hence toward the creation of even more intelligent text processing tools). Abstract meaning representation (AMR) parsers provide additional information not encoded in SRL representations (e.g. negation and modality), which is often valuable in downstream applications.

2. Besides producing an effective co-reference system, our work in this direction has wider implications. In fact, the method we introduced (specifically the loss function) can potentially be used to improve any co-reference system, for any language.

3. Deep learning, and specifically recent classes of recurrent neural networks (LSTMs), has had a large impact on NLP. One of the remaining challenges is the lack of simple and effective methods for incorporating structured information (e.g. syntax or semantics) into such models. Our work on incorporating graph structure into neural models is an important step towards resolving this challenge.