Content archived on 2024-04-16

Natural Language processing of PAtient Discharge summaries

Objective

The dramatic expansion of health services over the past several decades has created a growing demand for fast and easy access to relevant data. For a number of applications, such as ongoing patient care, evaluation of health care, and clinical research, the most important information about the patient's medical history is mainly available in narrative patient discharge summaries (PDS). The overall goal of this project is to provide better access to this information through the development of a natural language (NL) processing system that applies medical knowledge to perform two kinds of services:
a) intelligent automatic extraction of the data from the PDS in order to store them in an appropriate database and
b) intelligent NL retrieval of medical data from that database, thus allowing users to access the information quickly and easily.
A project was created to provide better access to the information about the patient's medical history contained in narrative patient discharge summaries (PDS), through the development of a natural language (NL) processing system that applies medical knowledge to perform two kinds of services:
a) intelligent automatic extraction of the data from the PDS in order to store them in an appropriate database (extraction prototype);
b) intelligent NL retrieval of medical data from that database.
Only the extraction is considered here. Its feasibility was evaluated by designing and testing a small prototype performing information extraction from PDSs in a restricted medical domain: therapy management of thyroid cancer. A preliminary study was carried out first. Physicians' information needs were studied, a lexical study was carried out to measure the size of the vocabulary involved, and the syntactic characteristics of the PDS were examined. The next step was to identify and describe, in a computer-readable format, the various types of domain models needed by the text understanding system. A knowledge representation formalism was set up to describe both the domain knowledge and a semantic lexicon for the test domain. At the lexical level, each word was described in terms of a set of semantic components. A situational model of the information conveyed by a PDS sentence was built. For testing purposes, a simple natural language generation program was connected to the conceptual extraction module. The syntactic and conceptual modules were finally integrated into a single extraction prototype, and this prototype was tested on a selected corpus of typical PDS sentences. These tests showed that the prototype extraction system obtained a fair rating given the size of its knowledge base.
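To illustrate what describing each word by a set of semantic components and building a situational model can look like, here is a minimal Python sketch. It is not the project's formalism: the lexicon entries, the component names (`category`, `organ`, `action`, `nature`) and the function `extract_frame` are all hypothetical, chosen only to show the idea on a toy scale.

```python
# Hypothetical toy lexicon: every entry and component name below is an
# illustrative assumption, not taken from the project's knowledge base.
LEXICON = {
    "thyroidectomy": {"category": "procedure", "organ": "thyroid", "action": "removal"},
    "carcinoma": {"category": "disease", "nature": "malignant"},
    "levothyroxine": {"category": "drug", "role": "hormone_replacement"},
}


def extract_frame(sentence):
    """Build a flat frame that groups recognised words by semantic category."""
    frame = {}
    # Crude tokenisation: lowercase, drop commas and full stops, split on spaces.
    cleaned = sentence.lower().replace(",", " ").replace(".", " ")
    for token in cleaned.split():
        entry = LEXICON.get(token)
        if entry is not None:
            frame.setdefault(entry["category"], []).append(token)
    return frame


frame = extract_frame("Total thyroidectomy was performed for papillary carcinoma.")
print(frame)
```

A real system would add syntactic analysis and relations between the concepts; this sketch only shows how lexical entries with semantic components can drive the filling of a structured record.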
The proposal rests on several hypotheses to be evaluated in the pilot phase of AIM:
- intelligent, knowledge-based NL processing can meet medical information needs;
- NL technology has now reached a point where it can be applied to the medical domain with a serious chance of success;
- both extraction of information from narrative PDS and subsequent NL retrieval of information require an extensive knowledge base of the medical domain covered by the system;
- the same kind of knowledge can be used both for extraction and retrieval, thus requiring only one knowledge base for both components of the system.
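The last hypothesis, one knowledge base serving both extraction and retrieval, can be sketched as follows. This is an assumption-laden toy in Python, not the project's design: the term-to-concept table, the concept codes and the functions `concepts_in`, `extract` and `retrieve` are all hypothetical, and plain substring matching stands in for real NL analysis.

```python
# Hypothetical shared knowledge base: surface terms mapped to concept codes.
# Terms and codes are invented for illustration only.
KNOWLEDGE_BASE = {
    "thyroidectomy": "PROC_THYROIDECTOMY",
    "thyroid removal": "PROC_THYROIDECTOMY",
    "radioiodine": "THER_RADIOIODINE",
    "i-131": "THER_RADIOIODINE",
}


def concepts_in(text):
    """Return the concept codes whose terms occur in the text (substring match)."""
    lowered = text.lower()
    return {code for term, code in KNOWLEDGE_BASE.items() if term in lowered}


def extract(database, patient_id, summary):
    """Extraction side: store the concepts found in a discharge summary."""
    database[patient_id] = concepts_in(summary)


def retrieve(database, question):
    """Retrieval side: interpret the question with the same knowledge base."""
    wanted = concepts_in(question)
    return sorted(pid for pid, found in database.items() if wanted and wanted <= found)


db = {}
extract(db, "p1", "Thyroid removal followed by I-131 ablation.")
extract(db, "p2", "Follow-up visit, no treatment given.")
matches = retrieve(db, "Which patients had a thyroidectomy and radioiodine?")
print(matches)
```

Because both sides map wording to the same concept codes, a question phrased with "thyroidectomy" can match a summary that says "thyroid removal", which is the practical benefit the hypothesis points at.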
The Exploratory Phase will evaluate these hypotheses by designing and testing a small prototype performing both information extraction from PDS and processing of NL requests in a selected medical specialty. One important characteristic is that we shall make use, both for information extraction and information retrieval, of existing prototypes of NL processing systems, which will allow us to test important hypotheses quickly without excessive implementation work. As the application domain, we chose therapy management of thyroid cancer, a domain where knowledge is well defined and almost independent of other medical domains. This enables us to work in a "closed world" and thus to test more easily methods that make extensive use of domain knowledge. The results of the pilot phase will include feasibility conclusions on the prototyping experiments and on the porting of the system to other medical domains.
Main Deliverables:
A prototype of an automatic system for the extraction of information from PDS and a prototype of an information retrieval system using natural language. The project will also produce reports on knowledge and information needs.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques.


Topic(s)

Data not available

Call for proposal

Data not available

Funding Scheme

Data not available

Coordinator

BIM S.A./N.V.
EU contribution
No data
Address


Belgium


Total cost
No data

Participants (1)