Deep neural networks (DNNs) have become a critical tool in natural language processing (NLP) for a wide variety of language technologies, from syntax to semantics to pragmatics. In particular, in the field of natural language inference (NLI), DNNs have become the de facto standard, providing significantly better results than previous paradigms. Their power lies in their ability to embed complex language ambiguities in high-dimensional spaces, coupled with non-linear compositional transformations that are learned to directly optimize task-specific objective functions. We propose to adapt deep NLI techniques to the biomedical domain, specifically investigating question answering, information extraction and synthesis. The biomedical domain presents key challenges, and a potential for impact, that standard NLI tasks do not possess. First, while standard NLI data sets require a system to model basic world knowledge (e.g., that ‘soccer’ is a ‘sport’), they do not presume rich domain knowledge encoded in varied and often heterogeneous resources such as scientific articles, textbooks and structured databases. Second, while standard NLI data sets presume that the answer or inference is encoded in a single utterance, reasoning over and extracting information from biomedical texts often requires synthesizing information from multiple utterances, paragraphs, and even documents. Finally, whereas standard NLI is a broad challenge aimed at testing whether computers can make general inferences in language, biomedical text is a grounded, high-impact domain where progress in automated reasoning will directly improve the effectiveness of researchers, physicians, publishers and policy makers.
Field of science
- /natural sciences/computer and information sciences/data science/natural language processing
- /natural sciences/computer and information sciences/artificial intelligence/computational intelligence
- /humanities/languages and literature/languages - general