Unlocking electronic medical records to revolutionise health research

Using artificial intelligence and natural language processing, a new project aims to accelerate health science and precision medicine.

Over the past two decades, funding for biomedical research has more than doubled, yet approvals for new drugs have dropped by a third. The general consensus is that more precision medicine is needed, with targeted drugs and treatments, and a search for the underlying mechanisms behind diseases could help. Such a breakthrough could come from electronic health records (EHRs), which hold troves of information generated by clinicians during general practice. EHRs show how clinicians actually approach problems within the constraints of their working environments. This vital information is often written as unstructured text, from which it is difficult to extract data accurately and at scale.

The EU-supported SAVANA project, hosted by Savana Médica in Spain, developed a means of harnessing a specific branch of artificial intelligence, known as clinical natural language processing (NLP), to capture the value within this vast amount of text. Savana’s CEO and founder, Jorge Tello, explains: “Just think of the amount of valuable data written on electronic health records? Now multiply that data by thousands of clinical documents each clinical institution generates, in dozens of countries. Can you imagine the potential that all that information would offer to clinicians, managers and researchers? The benefit to science and ultimately for patients is undeniable.”

Recent events have put that vision into practice. Through the BigCOPData project, Savana refined its technology to read respiratory records with particular precision. When the COVID-19 pandemic started, Savana was able to launch its Big COVIData study, which defined the clinical characteristics and predictive factors for patients with COVID-19, as soon as the first outbreak ravaged Europe.

Clinical natural language processing

Savana has developed EHRead©, a technology that reuses the big data contained in the free text of EHRs. It applies clinical NLP and deep learning techniques to provide a large-scale, comprehensive system that automatically processes and structures information from EHR free text to support clinical research and practice. EHRead© can automatically process anonymised, de-identified, unlinked patient text documents from EHRs in several languages, further broadening the project’s European scope. The team also developed a system called Natural Privacy, a method of generating databases from unstructured texts that protects individual privacy. This extra security layer means privacy is preserved to a greater extent than in competitor models. The programme allows researchers to conduct observational, retrospective studies and to make correlations between clinical variables. This helps users identify prognostic factors of disease progression, check the efficacy of pharmacological treatments and predict available health resources.
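To give a rough sense of what structuring free text involves, the Python sketch below turns a few invented clinical notes into sets of conditions and counts how often each appears. It is an illustration only and not how EHRead© works: the term list, the negation words, the sample notes and the function name extract_conditions are all hypothetical, and simple keyword matching stands in for the clinical NLP and deep learning models described above.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary mapping surface forms to one concept name.
# A real clinical NLP system relies on far richer terminologies and models.
TERMS = {
    "copd": "COPD",
    "chronic obstructive pulmonary disease": "COPD",
    "heart failure": "heart failure",
    "type 2 diabetes": "type 2 diabetes",
}

# Hypothetical negation cues; a term is skipped if one precedes it in the sentence.
NEGATION = re.compile(r"\b(no|denies|without)\b")


def extract_conditions(note: str) -> set:
    """Return the conditions mentioned in a note and not negated in their sentence."""
    found = set()
    for sentence in re.split(r"[.;\n]", note):
        lowered = sentence.lower()
        for surface, concept in TERMS.items():
            match = re.search(r"\b" + re.escape(surface) + r"\b", lowered)
            if match and not NEGATION.search(lowered[:match.start()]):
                found.add(concept)
    return found


# Invented, already de-identified example notes.
NOTES = [
    "Patient with COPD and type 2 diabetes. No heart failure.",
    "History of chronic obstructive pulmonary disease; denies chest pain.",
    "Admitted for heart failure. Type 2 diabetes well controlled.",
]

# Structured view: one set of conditions per note, plus overall frequencies.
structured = [extract_conditions(note) for note in NOTES]
counts = Counter(concept for conditions in structured for concept in conditions)
```

Once notes are reduced to structured variables like these, counting and cross-tabulating them becomes straightforward, which is the kind of step that makes retrospective studies on free text feasible.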

At the cutting edge of healthcare

At present, researchers worldwide are using the EHRead© technology to conduct studies in 17 therapeutic areas, including oncology, pneumology and cardiology. “For us, the achievements of the company are the important results each one of the clinicians and researchers obtain using Savana’s tool. This includes research in many different medical areas, from multiple myeloma, to heart and kidney failure for Type 2 diabetes or multiple sclerosis,” says Ignacio H. Medrano, Savana’s chief medical officer and founder. Two international respiratory studies are also under way, enabling high-quality collaborative research and cutting-edge investigation into chronic obstructive pulmonary disease and COVID-19 to improve the lives of people living with respiratory conditions.